Array and uses thereof

ABSTRACT

An array of nucleic acid probes is described for identifying and/or characterizing a pathotype of a microorganism. Methods are also described for detecting the presence of a microorganism in a sample, as well as determining its pathotype, using the array. Methods of assessing related infection and disease in a subject using the array are also described.

FIELD OF THE INVENTION

[0001] The invention relates to an array and uses thereof, and particularly relates to an array for pathotyping a microorganism and uses thereof.

BACKGROUND OF THE INVENTION

[0002] A variety of pathogenic microorganisms exist, which pose a continued health threat. An example is the bacterium Escherichia coli, which is commonly found in the environment as well as in the digestive tracts of common animal species including humans. Individual strains within Escherichia coli (E. coli) can vary in pathogenicity from innocuous to highly lethal, as evidenced by incidents of its contamination of drinking water and outbreaks of so-called hamburger disease. The pathogenicity of a given E. coli depends on the presence or absence of virulence genes within its genome. These virulence genes are ideal targets for the determination of the pathogenicity potential of any given E. coli isolate.

[0003] Numerous molecular methods have been used for detecting and identifying pathogenic E. coli. However, these approaches suffer from a variety of limitations, the most serious of which is related to the large variety of virulence factors distributed among the known pathotypes. Currently, there is no practical, cost-effective way to determine rapidly and simultaneously the presence or absence of this large set of virulence genes within a given E. coli strain.

[0004] It would therefore be desirable to have improved methods and materials for the detection of pathogenic microorganisms, such as bacteria (e.g. E. coli).

SUMMARY OF THE INVENTION

[0005] The invention relates to a collection of probes, e.g. in an array format, and uses thereof.

[0006] Accordingly, in a first aspect, the invention provides an array comprising: a substrate; and a plurality of nucleic acid probes, each of the probes being bound to the substrate at a discrete location; the plurality of probes comprising a first probe for a first pathotype of a species of a microorganism and a second probe for a second pathotype of the species, wherein the first and second pathotypes are not identical. In an embodiment, the array comprises at least 103 distinct nucleic acid probes. In embodiments, each of the probes is independently greater than or equal to 20, 50 or 100 nucleotides in length. In an embodiment, the array comprises at least two probes for a single pathotype, wherein the two probes are not identical. In an embodiment, the array comprises a subarray, wherein the subarray comprises the at least two probes at adjacent discrete locations on the substrate.

[0007] In an embodiment, the plurality of probes comprises first, second, third and fourth probes for respective first, second, third and fourth pathotypes of the species, wherein the first, second, third and fourth pathotypes are not identical. In a further embodiment, the plurality of probes comprises first, second, third, fourth, fifth and sixth probes for respective first, second, third, fourth, fifth and sixth pathotypes of the species, wherein the first, second, third, fourth, fifth and sixth pathotypes are not identical. In yet a further embodiment, the plurality of probes comprises first, second, third, fourth, fifth, sixth, seventh and eighth probes for respective first, second, third, fourth, fifth, sixth, seventh and eighth pathotypes of the species, wherein the first, second, third, fourth, fifth, sixth, seventh and eighth pathotypes are not identical.

[0008] In an embodiment, the probe is for a virulence gene or fragment thereof or a sequence substantially identical thereto, wherein the virulence gene is associated with pathogenicity of the microorganism.

[0009] In an embodiment, the microorganism is a bacterium; in a further embodiment, a bacterium of the family Enterobacteriaceae; in a further embodiment, the bacterium is E. coli.

[0010] In an embodiment, the first and second pathotypes each independently comprise a pathotype selected from the group consisting of: enterotoxigenic E. coli (ETEC); enteropathogenic E. coli (EPEC); enterohemorrhagic E. coli (EHEC); enteroaggregative E. coli (EAEC); enteroinvasive E. coli (EIEC); uropathogenic strains (UPEC); E. coli strains involved in neonatal meningitis (MENEC); E. coli strains involved in septicemia (SEPEC); cell-detaching E. coli (CDEC); and diffusely adherent E. coli (DAEC).

[0011] In an embodiment, the first pathotype is selected from the group consisting of: enteroaggregative E. coli (EAEC); enteroinvasive E. coli (EIEC); E. coli strains involved in neonatal meningitis (MENEC); E. coli strains involved in septicemia (SEPEC); cell-detaching E. coli (CDEC); and diffusely adherent E. coli (DAEC).

[0012] In an embodiment, the virulence gene encodes a polypeptide of a class of proteins selected from the group consisting of toxins, adhesion factors, secretory system proteins, capsule antigens, somatic antigens, flagellar antigens, invasins, autotransporter proteins, and aerobactin system proteins. In an embodiment, the virulence gene is selected from the group consisting of afaBC3, afaE5, afaE7, afaD8, aggA, aggC, aida, bfpA, bmaE, cdt1, cdt2, cdt3, cfaI, clpG, cnf1, cnf2, cs1, cs3, cs31a, cvaC, derb122, eae, eaf, east1, ehxA, espA group I, espA group II, espA group III, espB group I, espB group II, espB group III, espC, espP, etpD, F17A, F17G, F18, F4, F41, F5, F6, fimA group I, fimA group II, fimH, fliC, focG, fyuA, hlyA, hlyC, ibe10, iha, invX, ipaC, iroN, irp1, irp2, iss, iucD, iutA, katP, kfiB, kpsMTII, kpsMTIII, 17095, leoA, lngA, lt, neuC, nfaE, ompA, ompT, paa, papAH, papC, papEF, papG group I, papG group II, papG group III, pai, rfbO9, rfbO101, rfbO111, rfbE O157, rfbE O157 H7, rfc O4, rtx, sfaDE, sfaA, stah, stap, stb, stx1, stx2, stxA I, stxA II, stxB I, stxB II, stxB III, tir group I, tir group II, tir group III, traT, and tsh genes. In an embodiment, the above-noted probe comprises a nucleic acid sequence selected from the group consisting of SEQ ID NO:1 to SEQ ID NO:102, or a fragment thereof, or a sequence substantially identical thereto.

[0013] In an embodiment, the substrate is selected from the group consisting of a porous support and a support having a non-porous surface. In embodiments the support is selected from the group consisting of a slide, chip, wafer, membrane, filter and sheet. In an embodiment, the slide comprises a coating capable of enhancing nucleic acid immobilization to the slide. In an embodiment, the probes are covalently attached to the substrate.

[0014] The invention further provides a method of detecting the presence of a microorganism in a sample, the method comprising: contacting the above-mentioned array with a sample nucleic acid of the sample; and detecting association of the sample nucleic acid to a probe on the array; wherein association of the sample nucleic acid with the probe is indicative that the sample comprises a microorganism from which the nucleic acid sequence of the probe is derived. In an embodiment, the sample nucleic acid comprises a label. In an embodiment, the label is a fluorescent dye (e.g. a cyanine, a fluorescein, a rhodamine and a polymethine dye derivative). In an embodiment, the method further comprises extracting the sample nucleic acid from the sample before contacting it with the array. In an embodiment, the sample nucleic acid is not amplified by PCR prior to contacting it with the array. In an embodiment, the method further comprises digesting the sample nucleic acid with a restriction enzyme to produce fragments of the sample nucleic acid prior to contacting with the array. In an embodiment, the fragments are of an average size of about 0.2 Kb to about 12 Kb. In an embodiment, the method further comprises labelling the sample nucleic acid prior to contacting it with the array. In an embodiment, the sample nucleic acid is selected from the group consisting of DNA and RNA.

[0015] In an embodiment, the above-mentioned sample is selected from the group consisting of environmental samples, biological samples and food. In an embodiment, the environmental samples are selected from the group consisting of water, air and soil. In an embodiment, the biological samples are selected from the group consisting of blood, urine, amniotic fluid, feces, tissues, cells, cell cultures and biological secretions, excretions and discharge.

[0016] In an embodiment, the method is further for determining a pathotype of a species of the microorganism, wherein the probe is for a pathotype of the species and wherein association of the sample nucleic acid with the probe is indicative that the microorganism is of the pathotype.

[0017] In an embodiment, the sample is a tissue, body fluid, secretion or excretion from a subject and the method is further for diagnosing an infection by the microorganism in the subject, wherein association of the nucleic acid with the probe is indicative that the subject is infected by the microorganism.

[0018] In an embodiment, the method is for diagnosing a condition related to infection by the microorganism in the subject, wherein the probe is for a pathotype of the species and wherein association of the sample nucleic acid with the probe is indicative that the microorganism is of the pathotype and that the subject suffers from a condition associated with the pathotype. In an embodiment, the condition is selected from the group consisting of: diarrhea, hemorrhagic colitis, hemolytic uremic syndrome, invasive intestinal infections, dysentery, urinary tract infections, neonatal meningitis and septicemia. In an embodiment, the subject is a mammal, in a further embodiment, a human.

[0019] The invention further provides a commercial package comprising the above-mentioned array together with instructions for: (a) detecting the presence of a microorganism in a sample; (b) determining the pathotype of a microorganism in a sample; (c) diagnosing an infection by a microorganism in a subject; (d) diagnosing a condition related to infection by a microorganism, in a subject; or (e) any combination of (a) to (d).

[0020] The invention further provides a use of the above-mentioned array for: (a) detecting the presence of a microorganism in a sample; (b) determining the pathotype of a microorganism in a sample; (c) diagnosing an infection by a microorganism in a subject; (d) diagnosing a condition related to infection by a microorganism, in a subject; or (e) any combination of (a) to (d).

[0021] The invention further provides a method of producing an array for pathotyping a microorganism in a sample, the method comprising: providing a plurality of nucleic acid probes, the plurality of probes comprising a first probe for a first pathotype of a species of the microorganism and a second probe for a second pathotype of the species, wherein the first and second probes are different; and applying each of the plurality of probes to a different discrete location of a substrate. In an embodiment, the method further comprises the step of crosslinking by exposure of the array to ultraviolet radiation. In an embodiment, the method further comprises heating the array subsequent to the crosslinking.

[0022] The invention further provides a method of producing an array for pathotyping a microorganism in a sample, the method comprising: selecting a plurality of nucleic acid probes, the plurality of probes comprising a first probe for a first pathotype of a species of the microorganism and a second probe for a second pathotype of the species, wherein the first and second probes are different; and synthesizing each of the plurality of probes at a different discrete location of a substrate.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] FIG. 1: Print pattern of the E. coli pathotype microarray according to an embodiment of the invention. (A) Grouping of genes by category. (B) Location of the individual genes.

[0024] FIG. 2: Detection of virulence genes and simultaneous identification of the pathotype of known E. coli strains after microarray hybridization with genomic DNA from (A) a nonpathogenic K-12 E. coli strain DH5α, (B) an enterohemorrhagic strain EDL933 O157:H7, (C) a uropathogenic strain J96, O4:K6 and (D) an enterotoxigenic strain H-10407. Genomic DNA after HindIII/EcoRI digestion was labeled with Cy3. Labeled DNA (500 ng) was hybridized to the array overnight at 42° C., washed, dried and scanned. Boxed spots in Panel A represent the virulence genes present in K-12 E. coli strain DH5α (traT, fimA, fimH, ompA, ompT, iss, fliC). Boxed spots in Panels B, C and D indicate the pathotype-specific genes in the tested strains. Genes present in more than one pathotype (iss, irp2, fliC, ompT) or present in all the pathotypes (fimH, fimA, ompA) gave a positive signal. The horizontal bar indicates the color representation of fluorescent-signal intensity.

[0025] FIG. 3: Virulence potential analysis of E. coli strains isolated from clinical samples using an E. coli pathotype microarray according to an embodiment of the invention. (A) Hybridization of genomic DNA from an avian E. coli isolate Av01-4156. (B) Hybridization pattern obtained with genomic DNA from a bovine strain B00-4830. (C) Hybridization of genomic DNA from a human E. coli isolate H87-540. Labeled DNA (500 ng) was hybridized to the array overnight at 42° C. after which the slide was washed, dried and scanned. Boxed spots indicate the pathotype-specific genes: iucD, iroN, traT and iutA in Panel A; etpD, F5, stap, and traT in Panel B; stx1, cdt2, cdt3, afaD8, bmaE, iucD, iroN, and iutA in Panel C. Positive signals were also obtained with genes present in more than one pathotype (espP, iss, ompT, fliC) and genes present in all the tested pathotypes (fimA, fimH, ompA).

[0026] FIG. 4: Detection of stx and cnf variant genes in clinical isolates of E. coli using a pathotype microarray according to an embodiment of the invention. The white boxes in Panel A outline the stx genes hybridized with (1) the human strain H87-5406 and (2) the bovine strain B99-4297. The white boxes in Panel B outline the cnf genes hybridized with (1) strain Ca01-E179 and (2) strain H87-5406. Labeled DNA (500 ng) was hybridized to an array overnight at 42° C. after which the slide was washed, dried and scanned.

[0027] FIG. 5: Use of an E. coli pathotype microarray according to an embodiment of the invention to identify the phylogenetic group of E. coli strains on the basis of their hybridization pattern with the attaching and effacing gene probes: (A) print pattern of espA, espB and tir probes on the pathotype microarray with the homology percentages between each immobilized probe; (B) detection of espA3, espB2 and tir3 in the human EPEC strain E2348/69; (C) hybridization pattern obtained with genomic DNA from the animal EPEC strain P86-1390 (espA1, espB3 and tir1); (D) detection of espA2, espB1 and tir2 in the EHEC strain EDL933. The positive hybridization results obtained with espA, espB and tir probes are outlined in white boxes.

[0028] FIG. 6: Schematic of virulence gene DNA microarray for Escherichia coli according to an embodiment of the invention. The number and alignment of DNA probes within sub-arrays and of sub-arrays within the microarray can vary as required. The embodiment illustrated depicts a subarray of 12 different gene probes (g1-g12), each being spotted twice. The 24 subarrays shown would represent 24×12=288 distinct virulence genes.
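
The layout arithmetic in the FIG. 6 legend can be checked with a short calculation. The following Python sketch is illustrative only; the function name and dictionary keys are hypothetical and are not part of the disclosure. It assumes, as in the legend, 24 subarrays of 12 distinct probes, each probe spotted twice.

```python
def array_capacity(n_subarrays: int, probes_per_subarray: int, replicates: int) -> dict:
    """Return the number of distinct probes and of printed spots for a
    layout built from equally sized subarrays (hypothetical helper)."""
    distinct = n_subarrays * probes_per_subarray
    return {
        "distinct_probes": distinct,            # one gene probe per distinct position
        "total_spots": distinct * replicates,   # every probe is printed `replicates` times
    }

# Values taken from the FIG. 6 legend: 24 subarrays, 12 probes each, duplicate spots.
layout = array_capacity(n_subarrays=24, probes_per_subarray=12, replicates=2)
print(layout)
```

Changing the arguments shows how capacity scales when the number or size of subarrays is varied, as the legend contemplates.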

[0029] FIG. 7: Schematic representation of a method of use of a virulence gene microarray according to an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

[0030] The present invention provides products and methods for the detection and characterization of microorganisms, such as bacteria (e.g. of the family Enterobacteriaceae, such as E. coli). The products and methods of the invention can be used to detect the presence of such a microorganism in a sample (e.g. a biological or environmental sample). Further, such products and methods can be used to characterize such a microorganism, e.g. determining/characterizing its pathotype.

[0031] Pathogenic E. coli are responsible for three main types of clinical infections: (a) enteric/diarrheal disease, (b) urinary tract infections and (c) sepsis/meningitis. On the basis of their distinct virulence properties and clinical symptoms of the host, pathogenic E. coli are divided into numerous categories or pathotypes. The diarrheagenic E. coli include (i) enterotoxigenic E. coli (ETEC) associated with traveller's diarrhea and porcine and bovine diarrhea, (ii) enteropathogenic E. coli (EPEC) causing diarrhea in children and animals, (iii) enterohemorrhagic E. coli (EHEC) associated with hemorrhagic colitis and hemolytic uremic syndrome in humans, (iv) enteroaggregative E. coli (EAEC) associated with persistent diarrhea in humans, and (v) enteroinvasive E. coli (EIEC) involved in invasive intestinal infections, watery diarrhea and dysentery in humans and animals (71). Extra-intestinal infections are caused by three separate E. coli pathotypes: (i) uropathogenic strains (UPEC) that cause urinary tract infections in humans, dogs and cats (8, 36, 87), (ii) strains involved in neonatal meningitis (MENEC) (87) and (iii) strains that cause septicemia in humans and animals (SEPEC) (25, 41, 66, 87).

[0032] Numerous bioassays and molecular methods have been developed for the detection of genes involved in pathogenic E. coli virulence mechanisms. However, the sheer number of known virulence factors has made this a daunting task. As described herein, microarray technology offers the most rapid and practical tool to detect the presence or absence of a large set of virulence genes simultaneously within a given E. coli strain. Prior to applicants' findings herein, only a few studies had reported the use of microarrays as a diagnostic tool (16, 18, 19, 63, 70). Described herein is a new approach for detection of a large number of virulence factors present in E. coli strains and the subsequent determination of the strain's pathotype. As described herein, nucleic acid sequences derived from most known virulence factors, including virulence-associated genes, were amplified by PCR and immobilized onto glass slides to create a virulence DNA microarray chip. By probing this virulence gene microarray with labeled genomic E. coli DNA, the virulence pattern of a given strain can be assessed and its pathotype determined in a single experiment.
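
The step from a detected virulence pattern to a candidate pathotype can be pictured as a set comparison. The Python sketch below is a minimal illustration under stated assumptions: the marker sets in `MARKERS` are a small, hypothetical selection chosen for this example from gene names listed herein and are not the scoring scheme of the invention; genes reported in the figure legends as present in all pathotypes (fimA, fimH, ompA) are deliberately left out of the marker sets because they do not discriminate between pathotypes.

```python
# Illustrative marker sets only (an assumption of this sketch, not the disclosed scheme).
MARKERS = {
    "EHEC": {"stx1", "stx2", "ehxA"},
    "ETEC": {"lt", "stah", "stap", "stb"},
    "EPEC": {"bfpA", "eaf"},
    "UPEC": {"papC", "hlyA", "cnf1"},
}

def candidate_pathotypes(detected):
    """Return, for each pathotype, the sorted marker genes found among the
    detected genes; pathotypes with no marker hit are omitted."""
    detected = set(detected)
    hits = {name: sorted(genes & detected) for name, genes in MARKERS.items()}
    return {name: genes for name, genes in hits.items() if genes}

# Hypothetical hybridization result: shared genes plus three EHEC-type markers.
detected = {"fimH", "ompA", "iss", "stx1", "stx2", "ehxA"}
result = candidate_pathotypes(detected)
print(result)
```

A real analysis would rely on the full probe set and on validated gene-to-pathotype associations rather than on this reduced table.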

[0033] As a practical example in support of this invention, an E. coli virulence factor microarray was designed and tested. It was of course recognized that applications of this microarray reach far into human health, drinking water and environmental research.

[0034] According to another aspect of the invention, a method is provided for analyzing a given liquid culture or colony of bacteria simultaneously for the presence of a number of these virulence genes in the same experiment.

[0035] In embodiments, an array of virulence genes may be used by reference laboratories involved in public or veterinary health. A simplified format of the microarray focusing on a few key virulence genes could find a broader market in routine medical or veterinary microbiological laboratory work.

[0036] Other types of virulence genes may be represented on such an array for a variety of applications. For example, the armed forces may be interested in implementing this type of technology for detection and/or identification of biological warfare agents.

[0037] The invention thus relates to products and methods which enable the parallel analysis in respect of a plurality of pathotypes of a microorganism(s), via the use of a collection of a plurality of nucleic acid probes derived from virulence genes of the microorganism(s), the collection corresponding to a plurality of pathotypes of the microorganism(s). In embodiments, the plurality of pathotypes may comprise at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 pathotypes.

[0038] Accordingly, in an aspect, the invention relates to a collection comprising a plurality of probes, the probes being derived from genetic/protein (e.g. a virulence gene) material/information from a microorganism and corresponding to a plurality of pathotypes of the microorganism. In an embodiment, the probes comprise a nucleic acid sequence derived from a microorganism or a sequence substantially identical thereto. In an embodiment, the collection can represent more than one microorganism.

[0039] “Pathotype” as used herein refers to the classification of a particular strain of a microorganism by virtue of the pathogenic phenotype it may manifest when it infects a subject. A plurality of strains may thus be grouped in the same pathotype if the strains are capable of resulting in the same phenotypic manifestation (e.g. disease symptoms) when they infect a subject. In the case of E. coli, for example, pathotypes may include those associated with intestinal and extraintestinal conditions. Such pathotypes include but are not limited to ETEC, EPEC, EHEC, EAEC, EIEC, UPEC, MENEC, SEPEC, CDEC and DAEC noted herein. As described herein, a pathotype may be identified and/or characterized using a probe based on a virulence gene associated with the pathotype, in a particular microorganism. “Virulence gene” as used herein refers to a nucleic acid sequence of a microorganism, the presence and/or expression of which correlates with the pathogenicity of the microorganism. In the case of bacteria, such virulence genes may in an embodiment comprise chromosomal genes (i.e. derived from a bacterial chromosome), or in a further embodiment comprise a non-chromosomal gene (i.e. derived from a bacterial non-chromosomal nucleic acid source, such as a plasmid). In the case of E. coli, examples of virulence genes and classes of polypeptides encoded by such genes are described below. Virulence genes for a variety of pathogenic microorganisms are known in the art.

[0040] Two probes which are “not identical”, as used herein, are two probes that have at least one structural difference. The difference may for example comprise an addition, deletion or substitution of one or more nucleotides or a rearrangement within the nucleotide sequence. Two pathotypes which are “not identical”, as described herein, are two classifications of pathogenic microorganisms that are sufficiently different to result in recognizably different pathogenic phenotypic manifestations when infecting a subject.

[0041] In an embodiment, the above-noted collection is in the form of an array, whereby the probes are bound to different, discrete locations of a substrate. The length of the probes may be variable, e.g. at least 20, 50, 100, 500, 1000 or 2000 nucleotides in length. High density nucleic acid probe arrays, also referred to as “microarrays,” may for example be used to detect and/or monitor the expression of a large number of genes, or for detecting sequence variations, mutations and polymorphisms. Microfabricated arrays of large numbers of oligonucleotide probes (variously described as “biological chips”, “gene chips”, or “DNA chips”) allow the simultaneous nucleic acid hybridization analysis of a target DNA molecule with a very large number of oligonucleotide probes. In one aspect, the invention provides biological assays using such high density nucleic acid or protein probe arrays. For the purpose of such arrays, “nucleic acids” may include any polymer or oligomer of nucleosides or nucleotides (polynucleotides or oligonucleotides), which include pyrimidine and purine bases, preferably cytosine, thymine, and uracil, and adenine and guanine, respectively. Polymers or oligomers of deoxyribonucleotides or ribonucleotides may be used, which may contain naturally occurring or modified bases, and which may contain normal internucleotide bonds or modified (e.g. peptide) bonds. A variety of methods are known for making and using microarrays, as for example disclosed in Cheung, V. G. et al. (1999) Nature Genetics Supplement, 21, 15-19; Lipshutz, R. J. et al. (1999) Nature Genetics Supplement, 21, 20-24; Bowtell, D.D.L. (1999) Nature Genetics Supplement, 21, 25-32; Singh-Gasson, S. et al. (1999) Nature Biotechnol. 17, 974-978; and Schweitzer, B. et al. (2002) Nature Biotechnol. 20, 359-365; all of which are incorporated herein by reference. DNA chip technology is described in detail in, for instance, U.S. Pat. No. 6,045,996 to Cronin et al., U.S. Pat. No. 5,858,659 to Sapolsky et al., U.S. Pat. No. 5,843,655 to McGall et al., U.S. Pat. No. 5,837,832 to Chee et al., and U.S. Pat. No. 6,110,426 to Shalon et al., all of which are specifically incorporated herein by reference. Suitable DNA chips are available for example from Affymetrix, Inc. (Santa Clara, Calif.).

[0042] Methods for storing, querying and analyzing microarray data have been disclosed in, for example, U.S. Pat. No. 6,484,183 issued to Balaban, et al. Nov. 19, 2002; U.S. Pat. No. 6,188,783 issued to Balaban, et al. Feb. 13, 2001; and Holloway, A. J. et al., (2002) Nature Genetics Supplement, 32, 481-489; each of which is incorporated herein by reference.

[0043] DNA chips generally include a solid substrate or support, and an array of oligonucleotide probes immobilized on the substrate. The substrate can be, for example, silicon or glass, and can have the thickness of a glass microscope slide or a glass cover slip. Substrates that are transparent to light are useful when the method of performing an assay on the chip involves optical detection. Suitable substrates include a slide, chip, wafer, membrane, filter, sheet and bead. The substrate can be porous or have a non-porous surface. Preferably, oligonucleotides are arrayed on the substrate in addressable rows and columns. A “subarray” may thus be designed which comprises a particular grouping of probes at a particular area of the array, the probes immobilized at adjacent locations or within a defined region of the array. A hybridization assay is performed to determine whether a target DNA molecule has a sequence that is complementary to one or more of the probes immobilized on the substrate. Because hybridization between two nucleic acids is a function of their sequences, analysis of the pattern of hybridization provides information about the sequence of the target molecule. DNA chips are useful for discriminating variants that may differ in sequence by as few as one or a few nucleotides.

[0044] Hybridization assays on the DNA chip involve a hybridization step and a detection step. In the hybridization step, a hybridization mixture containing the labelled target nucleic acid sequence is brought into contact with the probes of the array and incubated at a temperature and for a time appropriate to allow hybridization between the target and any complementary probes. The array may optionally be washed with a wash mixture which does not contain the target (e.g. hybridization buffer) to remove unbound target molecules, leaving only bound target molecules. In the detection step, the probes to which the target has hybridized are identified. Since the nucleotide sequence of the probes at each feature is known, identifying the locations at which target has bound provides information about the particular sequences of these probes.
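
The detection step described above amounts to looking up which probe sits at each location that gave a signal. The following Python sketch illustrates that lookup under assumptions that are not part of the disclosure: the data structures, the function name `detected_genes` and the fixed intensity `threshold` are hypothetical, and replicate spots of the same probe are simply averaged.

```python
from statistics import mean

def detected_genes(spot_map, intensities, threshold):
    """Call a gene 'present' when the mean signal of its replicate spots
    reaches the threshold.

    spot_map    -- {(row, col): gene name} describing the print pattern
    intensities -- {(row, col): background-corrected signal}
    threshold   -- arbitrary cut-off chosen for this sketch
    """
    signals_by_gene = {}
    for location, gene in spot_map.items():
        # A location missing from the scan is treated as no signal.
        signals_by_gene.setdefault(gene, []).append(intensities.get(location, 0.0))
    return sorted(g for g, vals in signals_by_gene.items() if mean(vals) >= threshold)

# Hypothetical 2x2 corner of an array with each probe spotted twice.
spot_map = {(0, 0): "fimH", (0, 1): "fimH", (1, 0): "stx1", (1, 1): "stx1"}
intensities = {(0, 0): 900.0, (0, 1): 1100.0, (1, 0): 40.0, (1, 1): 60.0}
present = detected_genes(spot_map, intensities, threshold=500.0)
print(present)
```

In practice the cut-off would be derived from negative controls or background statistics rather than fixed by hand.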

[0045] Hybridization may be carried out under various conditions depending on the circumstances and the level of stringency desired. Such factors shall depend on the specificity and degree of differentiation between target sequences for any given analysis. For example, to distinguish target sequences which differ by only one or a few nucleotides, conditions of higher stringency are generally desirable. Stringency may be controlled by factors such as the content of hybridization and wash solutions, the temperature of hybridization and wash steps, the number and duration of hybridization and wash steps, and any combinations thereof. In embodiments, the hybridization may be conducted at temperatures ranging from about 4° C. up to about 80° C., depending on the length of the probes, their G+C content and the degree of divergence to be detected. If desired, denaturing reagents such as formamide may be used to decrease the hybridization temperature at which perfect matches will dissociate. Commonly used conditions involve the use of buffers containing about 30% to about 50% formamide at temperatures ranging from about 20° C. to about 50° C. An example of such a partially denaturing buffer which is commercially available is the DIG Easy Hyb™ (Roche) buffer. In embodiments, un-labelled nucleic acids such as transfer RNA (tRNA) and salmon sperm DNA may be added to the hybridization buffers to reduce background noise. Under certain conditions, a divergence of 15% over long fragments (greater than 50 bases) can be reliably detected. Single nucleotide mismatches in shorter fragments (15 to 25 nucleotides in length) can also be detected if the hybridization conditions are designed accordingly. Hybridization time typically ranges from about one hour to overnight (16 to 18 hours approximately). After hybridization, microarrays are typically washed one to five times in buffered salt solutions such as saline-sodium citrate, abbreviated SSC, for periods of time and at salt concentrations and temperature appropriate for a particular objective. A representative procedure may for example comprise three washes in pre-warmed (50° C.) 0.1×SSC (1×SSC contains 150 mM NaCl and 15 mM trisodium citrate, pH 7). In embodiments, a detergent such as sodium dodecyl sulfate (SDS; e.g. at 0.1%) may be added to the washing buffer. Various details of hybridization conditions, some of which are described herein, are known in the art.
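
The salt concentrations of a diluted SSC wash follow directly from the 1×SSC composition given above. The short Python sketch below performs that scaling; the constant and function names are hypothetical, and only the 1×SSC values (150 mM NaCl, 15 mM trisodium citrate) are taken from the text.

```python
# Composition of 1x SSC in mM, as stated in the description.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}

def ssc_concentrations(fold: float) -> dict:
    """Return solute concentrations (mM) of an SSC solution at the given
    strength, e.g. fold=0.1 for 0.1x SSC (hypothetical helper)."""
    return {solute: conc * fold for solute, conc in SSC_1X_MM.items()}

wash = ssc_concentrations(0.1)  # the representative 0.1x SSC wash
for solute, conc in wash.items():
    print(f"{solute}: {conc:.2f} mM")
```

The same helper gives the composition of any other wash strength chosen to adjust stringency.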

[0046] Hybridization may be performed under absolute or differential formats. The former refers to hybridization of nucleic acids from one sample to an array, and the detection of the nucleic acids thus hybridized. The differential hybridization format refers to the application of two samples, labelled with different labels (e.g. Cy3 and Cy5 fluorophores), to the array. In this case differences and similarities between the two samples may be assessed.
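
For the differential format, one common way of comparing two labelled samples is a per-probe log ratio of the two channel intensities. The Python sketch below is an illustration only and is not a method prescribed herein; the function name, the input dictionaries and the `floor` value used to avoid taking the logarithm of zero are assumptions of the example.

```python
import math

def log2_ratios(cy3, cy5, floor=1.0):
    """Return log2(Cy5 / Cy3) for every probe measured in both channels.

    Intensities below `floor` are raised to `floor` so that the ratio is
    always defined (an arbitrary choice made for this sketch).
    """
    return {
        probe: math.log2(max(cy5[probe], floor) / max(cy3[probe], floor))
        for probe in cy3
        if probe in cy5
    }

# Hypothetical intensities: sample 1 labelled with Cy3, sample 2 with Cy5.
cy3 = {"stx1": 100.0, "fimH": 800.0}
cy5 = {"stx1": 1600.0, "fimH": 800.0}
ratios = log2_ratios(cy3, cy5)
print(ratios)
```

A ratio near zero indicates a similar signal in both samples, whereas a large positive or negative value points to a probe whose target is present predominantly in one of them.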

[0047] Many steps in the use of the DNA chip can be automated through use of commercially available automated fluid handling systems. For instance, the chip can be manipulated by a robotic device which has been programmed to set appropriate reaction conditions, such as temperature, add reagents to the chip, incubate the chip for an appropriate time, remove unreacted material, wash the chip substrate, add reaction substrates as appropriate and perform detection assays. If desired, the chip can be appropriately packaged for use in an automated chip reader.

[0048] The target polynucleotide whose sequence is to be determined is usually labelled at one or more nucleotides with a detectable label (e.g. detectable by spectroscopic, photochemical, biochemical, chemical, bioelectronic, immunochemical, electrical or optical means). The detectable label may be, for instance, a luminescent label. Useful luminescent labels include fluorescent labels, chemi-luminescent labels, bio-luminescent labels, and colorimetric labels, among others. Most preferably, the label is a fluorescent label such as a cyanine, a fluorescein, a rhodamine, a polymethine dye derivative, a phosphor, and so forth. Suitable fluorescent labels are described in for example Haugland, Richard P., 2002 (Handbook of Fluorescent Probes and Research Products, ninth edition, Molecular Probes). The label may be a light scattering label, such as a metal colloid of gold, selenium or titanium oxide. Radioactive labels such as ³²P, ³³P or ³⁵S can also be used.

[0049] When the target strand is prepared in single-stranded form, the sense of the strand should be complementary to that of the probes on the chip. In an embodiment, the target is fragmented before application to the chip to reduce or eliminate the formation of secondary structures in the target. Fragmentation may be effected by mechanical, chemical or enzymatic means. The average size of target segments following fragmentation is usually larger than the size of probe on the chip.

[0050] In embodiments, the target or sample nucleic acid may be extracted from a sample or otherwise enriched prior to application to or contacting with the array. Samples may be amplified by suitable methods, such as by culturing a sample in suitable media (e.g. LB) under suitable culture conditions to effect growth of microorganism(s) in the sample. Extraction may be performed using methods known in the art (see for example Sambrook et al. [1989] Molecular Cloning: A Laboratory Manual), including various treatments such as lysis (e.g. using lysozyme), heating, detergent (e.g. SDS) treatment, solvent (e.g. phenol-chloroform) extraction, and precipitation/resuspension. In an embodiment, the nucleic acid is not amplified using polymerase chain reaction (PCR) methods prior to application to the array.

[0051] In an embodiment, the probes may be provided, for example as a suitable solution, and applied to different, discrete regions of the substrate. Such methods are sometimes referred to as “printing” or “pinning”, by virtue of the types of apparatus and methods used to apply the probe samples to the substrate. Suitable methods are described in for example U.S. Pat. No. 6,110,426 to Shalon et al. The probe samples may be prepared by a variety of methods, including but not limited to oligonucleotide synthesis, as a PCR product using specific primers, or as a fragment obtained by restriction endonuclease digestion of a nucleic acid sample. Interaction/binding of the probe to the substrate may be enforced by non-covalent interactions and covalent attachment, for example via charge-mediated interactions as well as attachment to the substrate via specific reactive groups, crosslinking and/or heating.

[0052] In an embodiment, the arrays may be produced by, for example, spatially directed oligonucleotide synthesis. Methods for spatially directed oligonucleotide synthesis include, without limitation, light-directed oligonucleotide synthesis, microlithography, application by ink jet, microchannel deposition to specific locations and sequestration with physical barriers. In general these methods involve generating active sites, usually by removing protective groups; and coupling to the active site a nucleotide which, itself, optionally has a protected active site if further nucleotide coupling is desired.

[0053] In embodiments, the probes can be bound to the substrate through a suitable linker group. Such groups may provide additional exposure to the probe. Such linkers are adapted to comprise a terminal portion capable of interacting or reacting with the substrate or groups attached thereto, and another terminal portion adapted to bind/attach to the probe molecule.

[0054] Samples of interest, e.g. samples suspected of comprising a microorganism, for analysis using the products and methods of the invention include for example environmental samples, biological samples and food. “Environmental sample” as used herein refers to any medium, material or surface of interest (e.g. water, air, soil). “Biological sample” as used herein refers to a sample obtained from an organism, including tissue, cells or fluid. Biological excretions and secretions (e.g. feces, urine, discharge) are also included within this definition. Such biological samples may be derived from a patient, such as an animal (e.g. vertebrate animal, humans, domestic animals, veterinary animals and animals typically used in research models). Biological samples may further include various biological cultures and solutions.

[0055] The probes utilized herein may in embodiments comprise a nucleotide sequence identical to a nucleic acid derived from a microorganism or substantially identical or homologous to such a nucleic acid. “Homology” and “homologous” refers to sequence similarity between two peptides or two nucleic acid molecules. Homology can be determined by comparing each position in the aligned sequences. A degree of homology between nucleic acid or between amino acid sequences is a function of the number of identical or matching nucleotides or amino acids at positions shared by the sequences. As the term is used herein, a nucleic acid sequence is “homologous” to another sequence if the two sequences are substantially identical and the functional activity of the sequences is conserved (as used herein, the term ‘homologous’ does not infer evolutionary relatedness). Two nucleic acid sequences are considered “substantially identical” if, when optimally aligned (with gaps permitted), they share at least about 50% sequence similarity or identity, or if the sequences share defined functional motifs. In alternative embodiments, sequence similarity in optimally aligned substantially identical sequences may be at least 60%, 70%, 75%, 80%, 85%, 90% or 95%. As used herein, a given percentage of homology between sequences denotes the degree of sequence identity in optimally aligned sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, though preferably less than about 25% identity, with a sequence of interest.
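The identity thresholds given above can be illustrated with a short calculation on two sequences that have already been aligned. This is a minimal sketch: the function names are hypothetical, and counting only columns in which neither sequence has a gap is one possible reading of "positions shared by the sequences", adopted here as an assumption.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    shared = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
              if a != gap and b != gap]
    if not shared:
        return 0.0
    matches = sum(1 for a, b in shared if a == b)
    return 100.0 * matches / len(shared)


def classify(identity: float) -> str:
    """Apply the thresholds stated in the text (at least about 50%; less than 40%)."""
    if identity >= 50.0:
        return "substantially identical"
    if identity < 40.0:
        return "unrelated"
    return "intermediate"


identity = percent_identity("ACGT-A", "ACCTTA")
print(identity, classify(identity))
```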

[0056] Substantially complementary nucleic acids are nucleic acids in which the “complement” of one molecule is substantially identical to the other molecule. Optimal alignment of sequences for comparisons of identity may be conducted using a variety of algorithms, such as the local homology algorithm of Smith and Waterman, 1981, Adv. Appl. Math 2: 482, the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, the search for similarity method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85: 2444, and the computerised implementations of these algorithms (such as GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, Wis., U.S.A.). Sequence identity may also be determined using the BLAST algorithm, described in Altschul et al., 1990, J. Mol. Biol. 215:403-10 (using the published default settings). Software for performing BLAST analysis may be available through the National Center for Biotechnology Information (through the internet at http://www.ncbi.nlm.nih.gov/). The BLAST algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighbourhood word score threshold. Initial neighbourhood word hits act as seeds for initiating searches to find longer HSPs. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Extension of the word hits in each direction is halted when the following parameters are met: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. 
The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program may use as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (Henikoff and Henikoff, 1992, Proc. Natl. Acad. Sci. USA 89: 10915-10919), alignments (B) of 50, expectation (E) of 10 (or 1 or 0.1 or 0.01 or 0.001 or 0.0001), M=5, N=4, and a comparison of both strands. One measure of the statistical similarity between two sequences using the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. In alternative embodiments of the invention, nucleotide or amino acid sequences are considered substantially identical if the smallest sum probability in a comparison of the test sequences is less than about 1, preferably less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
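The three halting conditions for word-hit extension can be sketched as a simple ungapped extension in one direction. This is an illustration, not the BLAST implementation: it assumes the seed word matches exactly, treats M=5 as the match reward and N=4 as a mismatch penalty of 4, and uses an arbitrary drop-off X of 20; the function name is hypothetical.

```python
def extend_right(query: str, subject: str, q_pos: int, s_pos: int,
                 word_len: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20):
    """Extend a seed word to the right without gaps.

    Returns (best_score, residues_added_at_best_score).
    """
    score = best = word_len * match  # seed assumed to be an exact match
    best_len = 0
    q, s, steps = q_pos + word_len, s_pos + word_len, 0
    # Condition 3: stop when the end of either sequence is reached.
    while q < len(query) and s < len(subject):
        score += match if query[q] == subject[s] else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        # Condition 1: score fell by X from its maximum.
        # Condition 2: cumulative score dropped to zero or below.
        if best - score >= x_drop or score <= 0:
            break
        q += 1
        s += 1
    return best, best_len


print(extend_right("ACGTACGTTT", "ACGTACGTAA", 0, 0, word_len=4))
```

Extension to the left would follow the same logic with the indices decreasing.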

[0057] An alternative indication that two nucleic acid sequences are substantially complementary is that the two sequences hybridize to each other under moderately stringent, or preferably stringent, conditions. Hybridization to filter-bound sequences under moderately stringent conditions may, for example, be performed in 0.5 M NaHPO₄, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65° C., and washing in 0.2×SSC/0.1% SDS at 42° C. (see Ausubel, et al. (eds), 1989, Current Protocols in Molecular Biology, Vol. 1, Green Publishing Associates, Inc., and John Wiley & Sons, Inc., New York, at p. 2.10.3). Alternatively, hybridization to filter-bound sequences under stringent conditions may, for example, be performed in 0.5 M NaHPO₄, 7% SDS, 1 mM EDTA at 65° C., and washing in 0.1×SSC/0.1% SDS at 68° C. (see Ausubel, et al. (eds), 1989, supra). Hybridization conditions may be modified in accordance with known methods depending on the sequence of interest (see Tijssen, 1993, Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes, Part I, Chapter 2 “Overview of principles of hybridization and the strategy of nucleic acid probe assays”, Elsevier, N.Y.). Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point for the specific sequence at a defined ionic strength and pH.

[0058] Although various embodiments of the invention are disclosed herein, many adaptations and modifications may be made within the scope of the invention in accordance with the common general knowledge of those skilled in this art. Such modifications include the substitution of known equivalents for any aspect of the invention in order to achieve the same result in substantially the same way. Numeric ranges are inclusive of the numbers defining the range. In the claims, the word “comprising” is used as an open-ended term, substantially equivalent to the phrase “including, but not limited to”. The following examples are illustrative of various aspects of the invention, and do not limit the broad aspects of the invention as disclosed herein.

EXAMPLES

Example 1 Materials and Methods

[0059] Strains and Media.

[0060] E. coli strains used to produce PCR templates are listed in Table 1. E. coli isolates including characterized strains (the non-pathogenic K12-derived E. coli strain DH5α, the enterohemorrhagic strain EDL933, the uropathogenic strain J96, the enterotoxigenic strain H-10407 and the enteropathogenic strains E2348/69 and P86-1390) and uncharacterized clinical strains from bovine (B00-4830, B99-4297), avian (Av01-4156), canine (Ca01-E179) and human (H87-5406) origin were used to assess the detection thresholds and hybridization specificity of the virulence microarray. Most of the E. coli strains were obtained from the Escherichia coli laboratory collection at the Faculté de médecine vétérinaire of the Université de Montréal. E. coli strains A22, AL851, C248 (21) were kindly provided by Carl Marrs (University of Michigan) and IA2 by J. R. Johnson (University of Minnesota), respectively. All strains were stored in Luria-Bertani (LB [6]) broth plus 25% glycerol at −80° C. E. coli cultures were grown at 37° C. in LB broth for genomic DNA extraction and purification.

[0061] Selection and Sequence Analysis of Virulence Gene Probes.

[0062] The selection included virulence genes of E. coli pathotypes involved in intestinal and extra-intestinal diseases in humans and animals (see Table 1). The primers used for probe amplification were either chosen from previous studies on virulence gene detection or designed from available gene sequences (see Table 2). 103 E. coli virulence genes were targeted in this study, encoding (a) toxins (heat-labile toxin LT, human heat-stable toxin STaH, porcine heat-stable toxin STaP, Shiga-toxins Stx1 and Stx2, haemolysins Hly and Ehx, East1, STb, EspA, EspB, EspC, cytolethal distending toxin Cdt, cytotoxic necrosing factor Cnf, Cva, Leo) (b) adhesion factors (Cfa, Iha, Pap, Sfa, Tir, Bfp, Eaf, Eae, Agg, Lng, Aida, Foc, Afa, Nfa, Drb, Fim, Bma, ClpG, F4, F5, F6, F17, F18, F41) (c) secretion systems (Etp) (d) capsule antigens (KfiB, KpsMTII, KpsMTIII, Neu) (e) somatic antigens (RfcO4, RfbO9, RfbO101, RfbO111, RfbEO157) (f) flagellar antigen (FliC), (g) invasins (IbeA, IpaC, InvX), (h) autotransporters (Tsh), (i) aerobactin system (IucD, TraT, IutA) and, in addition, to espP (serine-protease), katP (catalase), omp (outer membrane proteins A and T), iroN (catechol siderophore receptor), iss (serum survival gene), putative RTX family exoprotein (rtx) and paa (related attaching and effacing gene) probes. The Yersinia high-pathogenicity island (irp1, irp2, and fyuA) present in different E. coli pathotypes and other Enterobacteriaceae was also targeted (3). An E. coli positive control gene, uidA, which encodes the E. coli-specific β-glucuronidase protein (17, 31) and the uspA gene which encodes a uropathogenic-specific protein (60) were added to this collection.

[0063] The DNA sequence of each gene was analyzed by BLAST analysis and ClustalW alignment followed by phylogenetic analysis. When the selected gene showed sequence divergence over 10% amongst different strains, new primers were designed to amplify the probe from each phylogenetic group, as was the case for the espA, espB and tir genes. The new primers were selected in conserved sequence areas flanking the area of divergence in order to ensure gene discrimination at the hybridization level. Phylogenetic analysis of the attaching and effacing locus (LEE) genes espA, espB and tir permitted us to distinguish three phylogenetic groups with regard to the sequence divergence cutoff value (<10%) chosen for this study. Attaching and effacing genes from strains EDL933, E2348/69 and RDEC-1 belonging to the different phylogenetic groups have been cloned and sequenced (27, 76, 92). Genomic DNA from strains EDL933 (EHEC), E2348/69 (human EPEC) and RDEC-1 (rabbit EPEC) was used as template to PCR amplify the different probes espA2-espB1-tir2, espA3-espB2-tir3 and espA1-espB3-tir1, respectively. The amplified probes were sequenced to confirm their identity and printed onto the pathotype microarray as shown in FIG. 1. For some virulence determinants, several genes of the cluster were targeted, such as hly (hlyA, hlyC), pap (papAH, papEF, papC, papG), sfa (sfaDE, sfaA) and agg (aggA, aggC). Utilization of several genes per cluster assisted in the confirmation of positive signals in addition to the assessment of cluster integrity. DNA probes detecting the genetic variants of Shiga-toxins (stx1, stx2, stxA1, stxA2, stxB1 and stxB2), cytolethal distending toxin (cdt1, cdt2 and cdt3), cytotoxic necrosing factor (cnf1, cnf2), and papG alleles (papGI, papGII and papGIII) were also included. In total, this gene sequence analysis resulted in the selection of 105 gene probes (Table 1).

Probe Amplification, Purification and Sequencing.

[0064] E. coli strains were grown overnight at 37° C. in Luria-Bertani medium. A 200 μl sample of the culture was centrifuged, and the pellet was washed and resuspended in 200 μl of distilled water. The suspension was boiled for 10 min and centrifuged. A 5 μl aliquot of the supernatant was used as a template for PCR amplification. PCR reactions were carried out in a total volume of 100 μl containing 50 pmol of each primer, 25 μmol of dNTP, 5 μl of template, 10 μl of 10× Taq buffer (500 mM KCl, 15 mM MgCl₂, 100 mM Tris-HCl, pH 9) and 2.5 U of Taq polymerase (Amersham-Pharmacia). PCR products were analyzed by electrophoresis on 1% agarose gels in TAE (40 mM Tris-acetate, 2 mM Na₂EDTA), then purified with the Qiaquick™ PCR Purification Kit (Qiagen, Mississauga, Ontario) and eluted in distilled water. Since the annealing temperature of the various PCR primers ranged from 40° to 65° C. and genomic DNA from 36 E. coli strains was used as template, all the PCR amplifications were done separately. A total of 103 virulence factor probes and two positive control probes, uidA and uspA, were amplified successfully as determined by amplicon size and DNA sequence. The purity of the amplified DNA was confirmed by agarose gel electrophoresis of 50-100 ng of each amplified fragment. The size of the PCR products ranged from 117 bp (east1) to 2121 bp (katP) with an average length of 500 bp for the majority of the DNA probes (Table 1). For quality control purposes all PCR fragments were partially sequenced for gene verification (Applied Biosystems 377 DNA sequencer using the dRhodamine Terminator Cycle Sequencing Ready™ reaction Kit).
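The final concentrations in the 100 μl reaction follow from the stated volumes and stock strengths. The sketch below works through that dilution arithmetic for the buffer components and primers only; the function names are hypothetical.

```python
# 10x Taq buffer composition as stated in the text (mM).
TAQ_BUFFER_10X_MM = {"KCl": 500.0, "MgCl2": 15.0, "Tris-HCl": 100.0}


def final_conc(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """Concentration after diluting `stock_ul` of stock into `total_ul`."""
    return stock_conc * stock_ul / total_ul


def primer_micromolar(pmol: float, total_ul: float) -> float:
    """pmol per microlitre is numerically equal to micromolar."""
    return pmol / total_ul


# 10 ul of 10x buffer in a 100 ul reaction gives 1x buffer.
buffer_1x = {name: final_conc(conc, 10.0, 100.0)
             for name, conc in TAQ_BUFFER_10X_MM.items()}
primer_um = primer_micromolar(50.0, 100.0)
print(buffer_1x, primer_um)
```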

[0065] Genomic DNA Extraction and Labeling.

[0066] Cells, collected by centrifuging 5 ml of an overnight culture, were washed with 4 ml of solution 1 (0.5 M NaCl, 0.01 M EDTA pH 8), resuspended in 1.2 ml of buffer 2 (solution 1 containing 1 mg/ml of lysozyme), then incubated at room temperature for 30 min. After SDS addition and phenol-chloroform extraction, total DNA was precipitated by adding one volume of isopropanol. The harvested pellet was washed with one volume of 70% ethanol, dried, then resuspended in 100 μl of Tris-EDTA buffer. Before labeling, total DNA was reduced in size by restriction enzyme digestion (New England BioLabs, Mississauga, Ontario) and, following digestion, the enzymes were removed by phenol-chloroform extraction. Cy3 dye was covalently attached to DNA using a commercial chemical labeling method (Mirus' Label IT™, PANVERA) with the extent of labeling depending primarily on the ratio of reagent to DNA and the reaction time. These parameters were varied to generate labeled DNA of different intensity. Two μg of the digested DNA were chemically labeled using 4 μl of Label IT™ reagent, 3 μl of 10× Mirus™ labeling buffer A and distilled water in a 30 μl total volume. The reactions were carried out at 37° C. for 3 h. Labeled DNA was then separated from free dye by washing four times with water and centrifugation through Microcon™ YM-30 filters (Millipore, Bedford, USA). The amount of incorporated fluorescent cyanine dye was quantified by scanning the probe from 200 nm to 700 nm and subsequently inputting the data into the % incorporation calculator found at http://www.pangloss.com/seidel/Protocols/percent inc.html. This method is based on the calculation of the ratio of μg of incorporated fluorescence: μg of labeled DNA.

[0067] Optimization of Microarray Detection Threshold Using a Prototype Microarray

[0068] A prototype chip was constructed and used to assess parameters, namely fragment length and extent of fluorescent labeling of the target (test) DNA, to optimize the spot detection threshold of the microarray. DNA amplicons from 34 E. coli virulence genes including the following EHEC virulence gene probes: espP (13), EHEC-hlyA (80), stx1 (84), stx2 (91), stxc (91), stxall (46), paa (1) and eae (4), were generated by PCR amplification and printed in triplicate. The probe lengths ranged from 125 bp (east1) to 1280 bp (irp1). A HindIII/EcoRI digestion was used to generate large fragments (average size ˜6 Kb) and Sau3A/AluI digestion to produce smaller DNA fragments (average size ˜0.2 Kb) from E. coli O157:H7 strain STJ348 genomic DNA. The restricted DNAs were labeled and used as the target for hybridization with the prototype microarray. In our experiments, the strongest hybridization signal was obtained by using larger fragments labeled at an optimal Cy3 rate in the range of 7.5 to 12.5. An estimate of the microarray's sensitivity was calculated by the following equation as described by De Boer and Beumer (24):

Sensitivity (%) = [p / (p + number of false negative spots)] × 100, where p is the number of true positive spots

[0069] Construction of the E. coli Pathotype Microarray

[0070] Virulence factor probes were grouped by pathotype with the resulting array being composed of eight subarrays each corresponding to well characterized E. coli categories (FIG. 1). The enterohemorrhagic (EHEC) subarray included Shiga-toxin gene probes (stx1, stx2, stxA1, stxA2, stxB1, stxB2 and stxB3), attaching and effacing genes (espA, espB, tir, eae, and paa), EHEC specific pO157 plasmid genes (etpD, ehxA, L9075, katP, espP) and O157 and O111 somatic antigen genes (rfbEO157 and rfbO111). Enteropathogenic E. coli (EPEC) was targeted by spotting LEE specific gene probes (eae, tir, espA, espB), espC and EPEC EAF plasmid probes (bfpA, eaf). The enterotoxigenic subarray (ETEC) included probes for human heat-stable toxin (STaH), porcine heat-stable toxin (STaP), heat-stable toxin type II (STb), heat-labile toxin (LT), adhesion factors shared by human ETEC (CFAI, CS1, CS3, LngA) or by animal ETEC (F4, F5, F6, F18, F41). DNA probes for O101 specific somatic antigen (rfbO101) and ETEC toxin (leoA) were also included. To identify uropathogenic strains, the UPEC subarray was composed of 27 probes selected for detection of extraintestinal E. coli adhesins Pap (papGI, papGII, papGIII, papAH, papEF, papC), Sfa (sfaA, sfaDE), Drb (drb122), Afa (afa3, afa5, afaE7, afaD8), F1C (focG), nonfimbrial adhesin-1 (nfaE), M-agglutinin subunit (bmaE), CS31A (clpG), toxins including hemolysins (hlyA and hlyC), cytotoxic necrosing factor (cnf1), and colicin V (cvaC), aerobactin receptor (iutA), capsular specific genes kfiB (K5), kpsMTII (K1, K5, K12), KpsMTIII (K10, K54) in addition to the surface exclusion gene (traT) and uspA probes. The cell-detaching subarray (CDEC) contained toxin probes cnf1, cnf2, cdt1, cdt2 and cdt3. The genes iucD, neuC, ibe10, rfbO9 and rfcO4 were designed to represent the meningitis-associated E. coli pathotype (MENEC). Enteroaggregative E. coli probes (EAEC) were derived from fimbrial specific genes aggA and aggC whereas enteroinvasive pathotype (EIEC) was targeted by invasin gene probes ipaC and invX. The AIDA (adhesin involved in diffuse adherence) probe was the unique marker for the diffusely adherent pathotype (DAEC).
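The grouping of probes into pathotype subarrays lends itself to a simple lookup structure. The sketch below lists only a subset of the probes named above for some of the subarrays, and the scoring rule (fraction of a subarray's listed markers that are positive) is a hypothetical illustration rather than the analysis used in the Examples, where patterns were read visually.

```python
# Subset of the subarray markers named in the text (illustrative, not exhaustive).
PATHOTYPE_MARKERS = {
    "EHEC": {"stx1", "stx2", "eae", "etpD", "ehxA", "katP", "espP", "rfbEO157"},
    "EPEC": {"eae", "tir", "espA", "espB", "espC", "bfpA", "eaf"},
    "CDEC": {"cnf1", "cnf2", "cdt1", "cdt2", "cdt3"},
    "MENEC": {"iucD", "neuC", "ibe10", "rfbO9", "rfcO4"},
    "EAEC": {"aggA", "aggC"},
    "EIEC": {"ipaC", "invX"},
    "DAEC": {"AIDA"},
}


def rank_pathotypes(positive_probes: set) -> list:
    """Rank pathotypes by the fraction of their listed markers detected."""
    scores = {
        name: len(markers & positive_probes) / len(markers)
        for name, markers in PATHOTYPE_MARKERS.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hybridization result resembling an EHEC pattern.
detected = {"stx1", "stx2", "eae", "etpD", "ehxA", "katP", "espP", "rfbEO157"}
print(rank_pathotypes(detected)[:2])
```

Because some probes (for example eae) appear in more than one subarray, a ranked list is more informative than a single call.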

[0071] Some virulence genes, such as fimA, fimH, irp1, irp2, iss, fyuA, ompA, east1, iha, fliC, tsh and ompT are shared by several E. coli pathotypes, and are thus indicative of subsets of pathotypes rather than specific to any one pathotype in particular. Finally a positive control, the uidA gene probe (17, 31) as well as a negative control composed of 50% DMSO solution were added. An estimate of the specificity of the virulence microarray was calculated by the following equation (24):

Specificity (%) = [n / (n + number of false positive spots)] × 100, where n is the number of true negative spots
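The sensitivity and specificity estimates can be computed directly from spot counts. In this minimal sketch the function names and the example counts are hypothetical; the formulas are those given above, read as true positives over all expected positives and true negatives over all expected negatives.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity (%) = true positives / (true positives + false negatives) x 100."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Specificity (%) = true negatives / (true negatives + false positives) x 100."""
    return 100.0 * true_neg / (true_neg + false_pos)


# Hypothetical spot counts for illustration only.
print(sensitivity(true_pos=95, false_neg=5))
print(specificity(true_neg=90, false_pos=10))
```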

[0072] Printing and Processing of the Microarrays.

[0073] Two μg of each DNA amplicon were lyophilized in a speed-vacuum and resuspended in filtered (0.22 μm) 50% DMSO. The concentration of amplified products was adjusted to 200 ng/μl and 10 μl of each DNA amplicon was transferred to a 384-well microplate and stored at −20° C. until the printing step. DNA was then spotted onto CMT-GAPS™ slides (Corning Co., Corning, N.Y.) using a VIRTEK ChipWriter™ with Telechem SMP3™ microspotting pins. Each DNA probe was printed in triplicate on the microarray. After printing, the arrays were subjected to ultraviolet crosslinking at 1200 μJoules (U.V. Stratalinker™ 1800, Stratagene) followed by heating at 80° C. for four hours. Slides were then stored in the dark at room temperature until use.

[0074] Microarray Hybridization and Analysis.

[0075] Microarrays were prehybridized at 42° C. for one hour under a 22×22 mm coverslip (SIGMA) in 20 μl of pre-warmed solution A (DIG Easy Hyb™ buffer, Roche, containing 10 μg of tRNA and 10 μg of denatured salmon sperm DNA). After the coverslip was removed by dipping the slide in 0.1×SSC (1×SSC contained 150 mM NaCl and 15 mM trisodium citrate, pH 7), the array was rinsed briefly in water and dried by centrifugation at room temperature in 50 ml conical tubes for five min at 800 rpm. Fluorescently-labeled DNA was chemically denatured as described by the manufacturer and added to 20 μl of a fresh solution of pre-warmed solution A. Hybridization was carried out overnight at 42° C. as recommended by the manufacturer. After hybridization, the coverslip was removed in 0.1×SSC and the microarray washed three times in pre-warmed 0.1×SSC/0.1% SDS solution and once in 0.1×SSC for 10 min at 50° C. After drying by centrifugation (800 rpm, five min, room temperature), the array was analyzed using a fluorescent scanner (Canberra-Packard, Mississauga, Ontario). The slides were scanned at a resolution of 5 μm at 85% laser power and the fluorescence quantified after background subtraction using QuantArray™ software (Canberra-Packard). All hybridization experiments were replicated two to five times per genome.
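Since each probe is printed in triplicate and fluorescence is quantified after background subtraction, a per-probe call can be sketched as follows. This is an assumption-laden illustration: the use of the median of the three spots, and a cutoff of the negative-control mean plus three standard deviations, are hypothetical choices and are not taken from the QuantArray™ analysis described above.

```python
from statistics import mean, median, stdev


def call_probe(spots, backgrounds, neg_controls, n_sd: float = 3.0) -> bool:
    """Return True if the background-subtracted triplicate signal exceeds the cutoff.

    spots        -- raw intensities of the replicate spots for one probe
    backgrounds  -- local background intensity for each of those spots
    neg_controls -- background-subtracted intensities of negative-control spots
    """
    net = [max(s - b, 0.0) for s, b in zip(spots, backgrounds)]
    cutoff = mean(neg_controls) + n_sd * stdev(neg_controls)  # assumed rule
    return median(net) > cutoff


# Hypothetical intensities for illustration only.
negatives = [100.0, 120.0, 110.0]
print(call_probe([1500.0, 1600.0, 1400.0], [100.0, 100.0, 100.0], negatives))
print(call_probe([200.0, 210.0, 190.0], [100.0, 100.0, 100.0], negatives))
```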

Example 2 Assessment of the Pathotype Microarray for Virulence Pattern Analysis

[0076] To identify known virulence genes and consequently, the pathotype of the E. coli strain being examined, genomic DNA from several previously characterized E. coli strains was labeled and hybridized to the pathotype microarray. The K12-derived E. coli strain DH5α was included as a nonpathogenic control. Interestingly, E. coli DH5α produced a fluorescent hybridization signal with the uidA, fimA₁, fimA₂, fimH, ompA, ompT, traT, fliC and iss probes (FIG. 2A). GenBank analysis of the sequenced K12 strain MG1655 genome revealed the presence of the first seven genes whereas the iss probe is 90% similar to ybcU, a gene encoding a bacteriophage lambda Bor protein homolog (sequence K12). Surprisingly, a false positive signal was obtained with the cdt1 and aggA gene probes. These genes are absent in the E. coli K12 genome and their sequences are not homologous to any K12 genes. Moreover, these genes were not positive with K12 or O157:H7 strain EDL933 in earlier generations of the virulence chip. We postulated that the signal may have been the result of amplicon contamination in the final printing. Therefore, these two probes were not included in any subsequent hybridization analyses.

[0077] Since the genomic sequence of E. coli O157:H7 strain EDL933 is available on GenBank (NC_002655), this strain represented a good choice to assess the detection threshold and hybridization specificity of the E. coli virulence factors on the microarray. After hybridizing the pathotype microarray with Cy3-labeled genomic DNA from E. coli O157:H7, the scanned image (FIG. 2B) showed fluorescent signals with the EHEC specific genes encoding Shiga-toxins, the attaching and effacing cluster present in EHEC and EPEC E. coli, the genes carried on the EHEC pO157 plasmid, antigen and flagellar specific genes as well as iha, an adhesin encoding gene (AF401752) found in both the EHEC and UPEC pathotypes. Therefore the EHEC pathotype of E. coli O157:H7 was easily confirmed by a rapid visual scan of the virulence gene pattern (FIG. 1) of the scanned image.

[0078] The UPEC strain J96 (O4:K6) is a prototype E. coli strain from which various extraintestinal E. coli virulence factors have been cloned and characterized (73, 86). This strain possesses two copies of the gene clusters encoding P (pap-encoded) and P-related (prs-encoded) fimbriae, produces F1C (focG), contains two hly gene clusters encoding hemolysin and produces cytotoxic necrosing factor type 1 (cnf1). E. coli strain J96 DNA was labeled and hybridized to the pathotype microarray. The scanned array resulted in a UPEC pathotype hybridization pattern (FIG. 2C). All of the UPEC virulence genes cited above were detected, as well as other uropathogenic specific genes. From a taxonomic perspective, the microarray also permitted the detection of the O4 antigen gene (rfcO4).

[0079] An enterotoxin-producing strain of E. coli isolated from a case of cholera-like diarrhea, E. coli strain H-10407 (30), was used as a control strain to assess the ability of the microarray to identify the ETEC pathotype (FIG. 2D). Hybridization results showed the presence of the heat-stable enterotoxin STaH, the antigenic surface-associated colonization factor cfaI, the heat-labile enterotoxin LT and the east1 toxin, and a weak signal was obtained with the STaP probe. The hybridization pattern correlated well with the virulence profile and pathotype group of this strain (28, 29, 68).

Example 3 Determination of Virulence Patterns of Uncharacterized Clinical E. coli Strains

[0080] To further validate the pathotype chip, virulence gene detection was assessed by hybridization with genomic DNA from five clinical E. coli strains isolated from human (H87-5406) and animal (Av01-4156, B00-4830, Ca01-E179, B99-4297) sources. Genomic DNAs from these strains were fragmented and Cy3-labeled and the microarray hybridization patterns obtained were compared with PCR amplification results.

[0081] The virulence gene pattern obtained after microarray hybridization analysis with Cy3-labeled E. coli genomic DNA of avian-origin (Av01-4156) showed the presence of the extra-intestinal E. coli virulence genes (iucD, iroN, traT, iutA) and genes present in our K12 strain (fimA₁, fimA₂, fimH, iss, ompA, and ompT) (FIG. 3A). The temperature-sensitive hemagglutinin gene (tsh) that was often located on the ColV virulence plasmid in avian-pathogenic E. coli (APEC) (26) was also detected on the Av01-4156 virulence gene array. A strong hybridization signal was also obtained with the rtx probe derived from a gene located on the O157:H7 chromosome and encoding a putative RTX family exoprotein. The overall virulence factor detection pattern indicates that this strain is involved in extraintestinal infections.

[0082] When the pathotype microarray was hybridized with genomic DNA from strain B00-4830 isolated from bovine ileum, genes encoding ETEC fimbriae F5 and heat-stable toxin STaP were detected (FIG. 3B), indicating that this strain belongs to the animal ETEC pathotype. The hybridization pattern also showed the presence of traT, ompA, fimA₁, fimA₂, fimH, fliC genes and the EHEC-associated gene etpD.

[0083] The virulence pattern obtained after microarray hybridization analysis with Cy3-labeled genomic DNA from the human-origin E. coli strain H87-5406 was very complex and did not fall within a single pathotype category. The hybridization pattern revealed the presence of espP, iss, rtx, fimA₁, fimA₂, fimH, ompA, and ompT genes as well as the Shiga-toxin gene, stx1, detected in the enterohemorrhagic pathotype (FIG. 3C). Moreover, virulence genes involved in extra-intestinal infections (cdt2, cdt3, afaD8, bmaE, iucD, iroN, traT and iutA) were also observed. Strain H87-5406 was also positive for the type 2 cytotoxic necrosing factor encoded by the cnf2 gene.

[0084] The virulence patterns of two other isolates, the pulmonary isolated strain Ca01-E179 and the bovine strain B99-4297 (used elsewhere in this study) were clearly identified as UPEC pathotype and Shiga-toxin positive E. coli respectively (data not shown). The presence of all the pathotype-specific virulence factors that were positively identified by the microarray data for the above animal and human isolates, was further confirmed by PCR amplification of each positive signal.

Example 4 Discrimination Between Homologous Genes Belonging to Different Subclasses

[0085] Given the importance of the stx gene family, amplicons stxA1 and stxA2 specific for the A subunits of the stx1 and stx2 family (Table 2) were designed, in addition to using the published amplicons stx1 and stx2 (Table 1) which overlap the A and B subunits of the genes. Sequence similarity is of the order of 57% between the published stx1 and stx2 amplicons; similarity between the stxA1 and stxA2 amplicons designed herein is slightly higher, at 61%. As shown in FIG. 4A, the DNA probes used in this study for detection of stx1 and stx2 gene variants were successful in distinguishing stx1 from stx2, using either the previously published amplicons or the stxA subunit probes.

[0086] To further explore the potential of microarrays to distinguish gene variants within homologous gene families, the primers used for cnf1 and cnf2 probe amplification were derived from studies on the detection of cnf variant genes by PCR amplification. The resulting amplicons have 85% sequence similarity. Hybridization results obtained with genomic DNA from the cnf-positive strains H87-5406 and Ca01-E1799 (FIG. 4B) showed a clear distinction on the microarray between the cnf1 and cnf2 gene variants, a significant result given the high degree of similarity and the size (over 1 kb) of the amplicons used.
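By way of illustration only, the similarity percentages quoted for the stx and cnf amplicons can be thought of as the fraction of identical positions in a pairwise alignment. The sketch below assumes that the two sequences have already been aligned to equal length, with '-' marking gaps; the function name and the toy fragments are hypothetical, and the source does not state which alignment tool or scoring scheme was actually used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Assumes the inputs are pre-aligned to equal length, with '-' for gaps.
    Columns that are gaps in both sequences are skipped; a gap opposite a
    base counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not taken from the sequence listing).
toy_a = "ATGGCTAACGTTAGC-TAGG"
toy_b = "ATGGCAAACGATAGCATAGC"
print(percent_identity(toy_a, toy_b))
```

Under this definition, two amplicons at 85% identity differ at roughly 15 of every 100 aligned positions.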

[0087] Since the DNA microarray showed initial promise in discriminating between the known gene variants of stx and cnf, a more defined group of genes was selected in order to test the ability of the pathotype microarray to differentiate between different phylogenetic groups of genes with a sequence divergence cutoff value of >10%. The DNA sequence similarity values of the espA, espB and tir probes from the three different groups are summarized in FIG. 5A. The microarray was hybridized with labeled genomic DNA from the EDL933 (EHEC) and E2348/69 (EPEC1) strains. Labeled DNA from another strain, P86-1390, belonging to the same phylogenetic group as RDEC-1, was used to validate the hybridization specificity of the arrayed virulence genes. Hybridizations with the pathotype microarray were performed at 42° C. and 50° C. and, as shown in FIG. 5B, C and D, the labeled DNA hybridized as expected to probes specific for each phylogenetic group. Genomic DNA from strain P86-1390 hybridized with the espA1-espB3-tir1 probes, indicating that this strain belongs to the same group as RDEC-1, which correlates well with the phylogenetic analysis. A strong cross-hybridization signal was obtained between the espA1 and espA3 probes due to their high DNA-similarity score (89.6%). These hybridization patterns were obtained at 42° C. as well as at 50° C., suggesting that DNA sequence divergences of 25% can be resolved under standard hybridization conditions. These results demonstrated that the pathotype microarray can be a useful tool for strain genotyping.

[0088] The studies described herein entailed designing a DNA microarray containing 102 gene probes distributed into eight subarrays corresponding to various E. coli pathotypes. To evaluate the microarray regarding the specificity of the amplified virulence factor gene fragments, genomic DNAs from different E. coli strains were labeled and hybridized to the virulence factor microarray. To this end, applicants developed a simple protocol for probe and target preparation, labeling and hybridization. The use of PCR amplification for probe generation, and fragmented genomic DNA as labeled target allowed the detection of all known virulence factors within characterized E. coli strains. Direct chemical labeling of genomic DNA with a single fluorescent dye (Cy3) facilitated the work.

[0089] Since the fluorescent assay used herein was based on direct detection (single Cy dye) rather than differential hybridization (multiple dyes), optimization of the signal detection threshold was performed. It was determined that the signal intensity, apart from DNA homology and DNA labeling efficiency, depended on (i) the immobilized amplicon size, (ii) the gene copy number in the target genomic DNA and (iii) the size of the labeled target DNA. Within the large range of probe sizes tested (117 bp to 2121 bp), hybridization signal intensity could be affected by probe length when using homologous DNA. Quality control analysis of the printed microarray using terminal transferase showed heterogeneity in the spotted amplicons. Since this enzyme attaches Cy3 to the 3′ end of the fixed DNA amplicon, we expected that the quality control signal would be stronger with smaller amplicons due to an increased number of free ends. Unexpectedly, however, small fragments (less than 200 bp) produced poorer hybridization signals than larger amplicons. Using two strains with known genomes (K12 and EDL933), we can estimate the level of accuracy (sensitivity and specificity) of the current virulence chip as outlined in the Examples herein. The average sensitivity or accuracy in discriminating among the different virulence genes approached 97%. These estimates take into account a shared total of three false negatives among the total of 210 (i.e. 2×105) virulence gene spots for both strains.
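As a hedged illustration of how such accuracy figures are commonly derived, the sketch below computes sensitivity and specificity from counts of correct and incorrect spot calls. Only the three false negatives and the 2×105 spot total come from the text; the number of spots expected to be positive is a placeholder chosen for illustration, since the source does not report it, and the function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of genes truly present that the array calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of genes truly absent that the array calls negative."""
    return true_neg / (true_neg + false_pos)


# Figures stated in the text: 3 false negatives over 2 x 105 spots.
TOTAL_SPOTS = 2 * 105
FALSE_NEGATIVES = 3
missed_fraction = FALSE_NEGATIVES / TOTAL_SPOTS
print(f"missed spots: {missed_fraction:.2%}")

# HYPOTHETICAL placeholder: spots expected to be positive across both
# genomes. This value is not given in the source.
expected_positive = 100
true_positives = expected_positive - FALSE_NEGATIVES
print(f"sensitivity: {sensitivity(true_positives, FALSE_NEGATIVES):.1%}")
```

With a different number of expected positives the sensitivity changes accordingly, so the printed value should be read as an example of the calculation rather than as a reported result.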

[0090] Gene location is another factor to consider when designing gene detection microarrays. After hybridization with genomic DNA from E. coli O157:H7 strain EDL933, strong hybridization signals to etpD, ehxA, L7590, katP and espP were found. Since these genes are located on the pO157 plasmid (Accession number AF074613) (15), the stronger signal can be attributed to a higher copy number or gene dose. Moreover, many virulence genes are located on mobile elements such as plasmids, phages or transposons (69) and are encoded by foreign DNA acquired via horizontal gene transfer and inserted in the genome. These pathogenicity islands (PAIs) are highly unstable and are constantly shuttled between strains. However, in addition to their total horizontal transfer (12, 38) or deletion (10, 11, 40), several studies suggested that PAIs are subject to continuous modifications in their virulence factor composition (52). In earlier work, the detection of a single PAI gene was taken to reflect the presumed presence of all the additional virulence genes encoded by the PAI (59), but due to the potential for genetic rearrangements described above, this assumption is risky. Microarray technology represents an excellent tool to circumvent this PAI plasticity and identify genetic rearrangements by gene deletion or insertion on PAI clusters.

[0091] Recent investigations of E. coli virulence have revealed new information regarding the prevalence of virulence genes within a specific E. coli pathotype. For example, the cytolethal-distending factor (cdt) was first described as a virulence factor associated with EPEC and other diarrhea-associated pathotypes (2, 56, 57). Later, this gene was detected in strains involved in extraintestinal infections in humans and dogs (49-51, 54, 55). More recently, cdt and the urinary tract infection-associated gene (ompT) have been found to be as or more prevalent than traditional neonatal bacterial meningitis (NBM)-associated traits, such as ibeA, sfaS, and K1 capsule (52). The usefulness of the virulence microarray concept for exploring the global virulence pattern of strains and the potential detection of unexpected virulence genes was revealed by total genomic hybridizations with uncharacterized clinical strains. The rtx probe (encoding a putative RTX family exoprotein, accession number AE005229) located on the O157:H7 chromosome was amplified using genomic DNA from strain EDL933. BLAST analysis did not reveal significant similarities with any available sequences. Analysis of the hybridization patterns of the extraintestinal strain Av01-4156 and strain H87-5406 revealed a strong signal with the rtx probe, indicating the presence of a gene homologous to the rtx probe (FIG. 3). This gene was successfully amplified in both strains using the rtx-specific primers. To our knowledge, this is the first report of the presence of this gene in non-O157 strains.

[0092] The potential for possessing different combinations or sets of virulence genes within a given E. coli strain could lead to the emergence of new pathotypes. Consistent with this hypothesis, a combination of virulence factors from different pathotypes was observed in the clinical strain H87-5406. Microarray hybridization permitted detection of the Shiga-toxin gene stx1, associated with EHEC strains, in addition to virulence genes involved in extra-intestinal infections (cdt2, cdt3, afaD8, bmaE, iucD, iroN, traT, iutA). Starcic et al. (83) recently reported a case of a “bifunctional” E. coli strain isolated from dogs with diarrhea. When analyzed, only a few strains were positive for heat stable toxin (ST) and none of them produced diarrhea-associated fimbriae K88 or K99, in contrast with previous studies (85). However, most of these strains were positive for cytonecrosing toxin (cnf1) as well as P-fimbriae and hemolysin (hly) that are involved in extra-intestinal infections in humans and animals. It was thus concluded that hemolytic E. coli isolated from dogs with diarrhea have characteristics of both uropathogenic and necrotoxigenic strains.

[0093] The ability of the virulence microarray to provide a more thorough analysis of virulence genes, and consequently to detect potentially new pathotypes, is further illustrated by the present study, in which the ETEC pathotype of the bovine clinical strain B00-4830 was confirmed. In addition to the ETEC-associated virulence genes encoding StaP and F5 revealed in the hybridization pattern, the etpD gene, described by Schmidt et al. (82) as belonging to an EHEC type II secretion pathway, was unexpectedly found to be present. In their study, Schmidt et al. (82) reported that the etp gene cluster was detected in all 30 of the EHEC strains tested by hybridization (using the 11.9 kb etp cluster from EDL933 as a probe) and by PCR using etpD-specific primers. However, none of the other E. coli pathotypes tested (EPEC, EAEC, EIEC, and ETEC) were positive for the etp gene cluster. As our results are contrary to this study, we assayed for the presence of the etpD gene in strain B00-4830 by PCR using the reverse primer described by Schmidt et al. (82) and a forward primer designed in our study. Amplification of the expected 509 bp fragment was consistent with the microarray results, confirming that the etpD gene can be found in ETEC strains.

[0094] Another unexpected finding of the study described herein was the prevalence of fimH and ompT genes that have been epidemiologically associated with extraintestinal infections (51, 54). BLAST analysis of ompT and fimH genes indicated the presence of both genes in E. coli K12 strain MG1655 and in enterohemorrhagic E. coli O157:H7 strain EDL933 and strain RIMD 0509952. In addition, the hybridization results herein revealed the presence of the fimH gene in all strains tested in this study, including non-pathogenic E. coli, EPEC, ETEC and UPEC strains. The ompT gene was less prevalent but present in the Shiga-toxin producing strain H87-5406. It was also found in another Shiga-toxin producing strain B99-4297 as well as in the EPEC strains P86-1390 and E2348/69. The use of these genes as indicators of the UPEC pathotype should be reconsidered.

[0095] The studies described herein thus demonstrate that DNA microarray technology can be a valuable tool for pathotype identification and assessing the virulence potential of E. coli strains including the emergence of new pathotypes. The DNA chip design described herein should facilitate epidemiological and phylogenetic studies since the prevalence of each virulence gene can be determined for different pathotypes (and strains) and the phylogenetic associations elucidated between virulence pattern and serotypes of a given strain. In addition, unlike traditional hybridization formats, microchip technology is compatible with the increasing number of newly recognized virulence genes since thousands of individual probes can be immobilized on one glass slide.

[0096] The DNA labeling methodology, hybridization and pathotype assessment described herein are both rapid and sensitive. The applications of such microarrays extend broadly from the medical field to drinking water, food quality control and environmental research, and can easily be expanded to virulence gene detection in a variety of pathogenic microorganisms.

TABLE 1 Genes targeted, primer sources and strains used as PCR amplification templates in this study.

Gene | Accession number | Size (bp) | SEQ ID NO: | Strains | Reference of primers
afaBC3 | X76688 | 793 | 1 | A22 | (22, 62)
afaE5 | X91748 | 470 | 2 | AL 851 | This study
afaE7 | AF072901 | 618 | 3 | 262-KH 89 | (61)
afad8 | AF072900 | 351 | 4 | 2787 | (61)
agga | U12894 | 432 | 5 | Strain 17.2 | (78)
aggc | U12894 | 528 | 6 | Strain 17.2 | (78)
aida | X65022 | 644 | 7 | 2787 | (5)
bfpa | U27184 | 324 | 8 | O126: H6 E2348/69 | (39)
bmae | M15677 | 505 | 9 | 215 | (54)
cdt1 | U03293 | 412 | 10 | O15: KRVC383 OvinS5 | This study
cdt2 | U042208 | 556 | 11 | O15: KRVC383 OvinS5 | This study
cdt3 | U89305 | 556 | 12 | O15: KRVC383 OvinS5 | This study
cfai | S73191 | 479 | 13 | H-10407 cfaI | This study
clpg | M55389 | 403 | 14 | 215 | (7)
cnf1 | X70670 | 1112 | 15 | J96 O4: K12 | (74)
cnf2 | U01097 | 1240 | 16 | O15: KRVC383 OvinS5 | (74)
cs1 | M58550 | 321 | 17 | PB-176P cfa−II | This study
cs3 | M35657 | 401 | 18 | PB-176 cfa+ II | This study
cs31a | M59905 | 710 | 19 | 31a | (37)
CvaC | X57525 | 680 | 20 | 1195 | (54)
derb122 | U87541 | 260 | 21 | O4: K12 J96 | This study
eae | U66102 | 791 | 22 | O157: H7 STJ348 | (4)
eaf | X76137 | 397 | 23 | O126: H6 E2348/69 | (32)
east1 | L11241 | 117 | 24 | O149: K91 P97-2554B | (90)
ehxa | AF043471 | 158 | 25 | O157: H7 STJ348 | (31)
espA group I | AF064683 | 478 | 26 | P86-1390 | This study
espA group II | AF071034 | 523 | 27 | O157: H7 EDL933 | This study
espA group III | AJ225016 | 481 | 28 | O126: H6 E2348/69 | This study
espB group I | AF071034 | 502 | 29 | O157: H7 EDL933 | This study
espB group II | Z21555 | 377 | 30 | O126: H6 E2348/69 | This study
espB group III | X99670 | 395 | 31 | P86-1390 | This study
espC | AF297061 | 500 | 32 | O126: H6 E2348/69 | This study
espP | AF074613 | 1830 | 33 | 215 | (13)
etpD | Y09824 | 509 | 34 | O157: H7 EDL933 | (82), this study
F17A | AF022140 | 441 | 35 | O15: KRVC383 OvinS5 | (20)
F17G | L33969 | 950 | 36 | O15: KRVC383 OvinS5 | (54)
F18 | M61713 | 510 | 37 | O139: K82 P88-1199 | (45)
F4 | M29374 | 601 | 38 | O149: K91 P97-2554B | (72)
F41 | X14354 | 431 | 39 | O9: K30 B44s | (72)
F5 | M35282 | 450 | 40 | O9: K30 B44s | (72)
F6 | M35257 | 566 | 41 | O9: K-P81-603A | (72)
fimA group I | Z37500 | 331 | 42 | 3292 | (65)
fimA group II | Z37500 | 331 | 42 | O157: H7 EDL933 | (65)
fimH | AJ225176 | 508 | 43 | O157: H7 EDL933 | (54)
fliC | U47614 | 625 | 44 | O157: H7 E32511 | (34)
focG | S68237 | 359 | 45 | O4: K12 J96 | (54)
fyuA | Z38064 | 207 | 46 | 1195 | (3)
hlyA | M10133 | 500 | 47 | O4: K12 J96 | (89)
hlyC | M10133 | 556 | 48 | O4: K12 J96 | (9)
ibe10 | AF289032 | 170 | 49 | O18 H87-5480 | (44, 54)
iha | AF126104 | 827 | 50 | O157: H7 E32511 | (53)
invX | L18946 | 258 | 51 | H84 (EIEC) | This study
ipaC | X60777 | 500 | 52 | O157: H7 E32511 | This study
iroN | AF135597 | 668 | 53 | CP9 | (53, 79)
irp1 | AF091251 | 1689 | 54 | 1195 | (3)
irp2 | L18881 | 1241 | 55 | 1195 | (3)
iss | X52665 | 607 | 56 | 3292 | This study
iucD | M18968 | 778 | 57 | 4787 | (42)
iutA | X05874 | 300 | 58 | 4787 | (47)
katP | X89017 | 2125 | 59 | O157: H7 EDL933 | (14)
kfiB | X77617 | 501 | 60 | K5(F9) 3669 | This study
KpsMT II | X53819 | 270 | 61 | K5(F9) 3669 | (54)
KpsMT III | AF007777 | 390 | 62 | 215 | (54)
17095 | AF074613 | 659 | 63 | O157: H7 EDL933 | (15, 64)
leoA | AF170971 | 501 | 64 | O149: K91 P97-2554B | This study
lngA | AF004308 | 424 | 65 | PB-176P cfa-II | This study
lt | J01646 | 275 | 66 | O149: K91 P97-2554B | (23, 33)
neuC | M84026 | 500 | 67 | O2: K1 U9/41 | This study
nfaE | S61970 | 537 | 68 | 31a | (54)
ompA | V00307 | 1422 | 69 | O4: K12 J96 | (77)
ompT | X06903 | 559 | 70 | O4: K12 J96 | (51)
paa | U82533 | 360 | 71 | O157: H7 STJ348 | This study
papAH | X61239 | 721 | 72 | O4: K12 J96 | (54)
papC | X61239 | 318 | 73 | 4787 | (62)
papEF | X61239 | 336 | 74 | O4: K12 J96 | (88)
PapG group I | M20146 | 461 | 75 | O4: K12 J96 | (67)
PapG group II | M20181 | 190 | 76 | IA2 | (48)
PapG group III | X61238 | 268 | 77 | O4: K12 J96 | (48)
pai | AF081286 | 922 | 78 | h140 8550 | (54)
rfbO9 | D43637 | 501 | 79 | O9: F6 K P81-603A | This study
RfbO101 | X59852 | 500 | 80 | O101 h510a | This study
RfbO111 | AF078736 | 406 | 81 | O111 H87-5457 | (75)
RfbE O157 | S83460 | 292 | 82 | O157: H7 EDL933 | (43)
RfbE O157 H7 | S83460 | 259 | 83 | O157: H7 STJ348 | (75)
Rfc O4 | U39042 | 786 | 84 | O4: K12 J96 | (54)
rtx | AE005229 | 521 | 85 | O157: H7 EDL933 | This study
sfaDE | X16664 | 408 | 86 | 4787 | (62)
sfaA | X16664 | 500 | 87 | 4787 | This study
stah | M29255 | 201 | 88 | H-10407 | This study
stap | M58746 | 163 | 89 | O149: K91 P97-2554B | (81)
stb | M35586 | 368 | 90 | O149: K91 P97-2554B | (58)
stx1 | L04539 | 583 | 91 | O157: H7 EDL933 | (35)
stx2 | AF175707 | 779 | 92 | O157 KNIH317 | (35)
stxA I | M23980 | 502 | 93 | O157: H7 EDL933 | This study
stxA II | Y10775 | 482 | 94 | O157: H7 EDL933 | This study
stxB I | M23980 | 151 | 95 | O157: H7 EDL933 | This study
stxB II | Y10775 | 211 | 96 | O157: H7 EDL933 | This study
stxB III | M36727 | 226 | 97 | O101 h510a | This study
tir group I | AF045568 | 442 | 98 | RDEC-1B | This study
tir group II | AF070067 | 479 | 99 | O157: H7 EDL933 | This study
tir group III | AB036053 | 443 | 100 | O126: H6 E2348/69 | This study
traT | J01769 | 288 | 101 | 3292 | (54)
tsh | AF218073 | 640 | 102 | O78: K80 Av 89-7098(143) | (26)
uidA | S69414 | 250 | 103 | O157: H7 EDL933 | (17)
uspA | AB027193 | 501 | 104 | h140 8550 | This study

[0097] TABLE 2 DNA sequences of primers designed in this study.

Gene | Forward | SEQ ID NO: | Reverse | SEQ ID NO:
afaE5 | GCGATCATGGCCGCGACCAGCA | 105 | CAACTCACCCAGTAGCCCCAGT | 106
cdt2 | GAAAGTAAATGGAATATAAATG | 107 | TTTGTGTTGCCGCCGCTGGTGAA | 108
cdt3 | GAAAGTAAATGGAATATAAATG | 109 | TTTGTGTCGGTGCAGCAGGGAAA | 110
cfaI | GGTGCAATGGCTCTGACCACA | 111 | GTCATTACAAGAGATACTACT | 112
cs1 | GCTCACACCATCAACACCGTT | 113 | CGTTGACTTAGTCAGGATAAT | 114
cs3 | GGGCCCACTCTAACCAAAGAA | 115 | CGGTAATTACCTGAAACTAAA | 116
derb122 | CGTGTGGGAGCCCTGAGCCTT | 117 | CCGGCCTGGTTGCTAGTATT | 118
espA group I | CATCAGTTGCTAGTGCGAATG | 119 | CAGCAAATGTCAAATACGTT | 120
espA group II | CGACATCGACGATCTATGACT | 121 | CCAAGGGATATTGCTGAAATA | 122
espA group III | CATCAGTTGCTAGTGCGAATG | 123 | CAGCAAATGTCAAATACGTT | 124
espB group I | CGGAGAGTACGACCGGCGCTT | 125 | GCACGGCTGGCTGCTTTCGTT | 126
espB group II | GCTGCCATTAATAGCGCAACT | 127 | TATTGTTGTTACCAGCCTTGC | 128
espB group III | GTAATGACGGTTAATTCTGTT | 129 | GCCGCATCAATAGCCTTAGAA | 130
espC | CCCATAACGGAACAACTCAT | 131 | CAGAATAGACCAAACATCTGCA | 132
etpD | GGCCACTTTCAATGTTGGTCA | 133 | CGACTGCACCTGTTCCTGATTA | 134
invX | TCTGATATAGTTTATATGGGT | 135 | TCAAACCCCACTCTTAATTAA | 136
ipaC | TTGCAAAAGCAATTTTGCAAC | 137 | TGCCGAACAATGTTCTCTGCA | 138
kfiB | AATTGTTTTAAAATCTGTTCT | 139 | TGAGACTGAAATTACATTTAA | 140
leoA | GAACAATTCAAACAGTTCAGT | 141 | TTATTCAAATCGCGCAATACC | 142
lngA | CAAATACAGTCCGCGTACGA | 143 | CCATTGTTACCTAAAGAGCGT | 144
neuC | TTGGCAGTTACAGGAATGCAT | 145 | AACAGTGAACCATATTTTAGT | 146
paa | ATGAGGAACATAATGGCAGG | 147 | TCTGGTCAGGTCGTCAATAC | 148
rfbO9 | GGTGATCGATTATTCCGCTGA | 149 | ACGCCTCATCGGTCAGCGCCT | 150
rfbO101 | TCTGCACGTTTAAAATTATTG | 151 | GTTTCTCCGTCAGAATCAAGC | 152
rtx | CTACCGTAGCGGGCGATGGTA | 153 | CAGCGCCTGTCCGTGTTCGGC | 154
sfaA | CCCTGACCTTGGGTGTTGCGA | 155 | GTACTGAACTTTAAAGGTGG | 156
stah | AAGAAATCAATATTATTTAT | 157 | AATAGCACCCGGTACAAG | 158
stxA I | GCGAAGGAATTTACCTTAGA | 159 | CAGCTGTCACAGTAACAAAC | 160
stxA II | CTTGAACATATATCTCAGGG | 161 | ACAGGAGCAGTTTCAGACAGT | 162
stxB I | GGTGGAGTATACAAAATATAA | 163 | ATGACAGGCATTAGTTTTAAT | 164
stxB II | TTCTGTTAATGCAATGGCGG | 165 | TTCAGCAAATCCGGAGCCTGA | 166
stxB III | GAAGAAGATGTTTATAGCGG | 167 | ACTGCAGGTATTAGATATGAT | 168
tir group I | ATTGGTGCCGGTGTTACTGCTG | 169 | CTCCCATACCTAAACGCAAT | 170
tir group II | ATTGGTGTTGCCGTCACCGCT | 171 | ACGCCATGACATGGGAGG | 172
tir group III | ATTGGTGCTGGTGTAACGACT | 173 | ATTGCGTTTAGGTATGGG | 174
uspA | CTACTGTTCCCGAGTAGTGTG | 175 | GGTGCCGTCCGGAATCGGCGT | 176
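As an illustrative consistency check, a primer pair such as those of Table 2 can be compared against its amplicon: the amplicon should begin with the forward primer and end with the reverse complement of the reverse primer. The sketch below uses the afaE5 primers (SEQ ID NOs: 105 and 106); the helper names are hypothetical, and the template is a shortened stand-in assembled from the first and last 30 nucleotides of SEQ ID NO: 2 as listed, not the full 470 bp amplicon.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_flank(amplicon: str, forward: str, reverse: str) -> bool:
    """True if the amplicon starts with the forward primer and ends with
    the reverse complement of the reverse primer."""
    amp = amplicon.upper()
    return amp.startswith(forward.upper()) and amp.endswith(
        reverse_complement(reverse).upper()
    )


AFAE5_FORWARD = "GCGATCATGGCCGCGACCAGCA"  # SEQ ID NO: 105
AFAE5_REVERSE = "CAACTCACCCAGTAGCCCCAGT"  # SEQ ID NO: 106

# Stand-in template: first 30 nt + last 30 nt of SEQ ID NO: 2 (assumption:
# the interior of the amplicon is omitted for brevity).
stand_in = "gcgatcatggccgcgaccagcactatcctc" + "gaacctggactggggctactgggtgagttg"

print(primers_flank(stand_in, AFAE5_FORWARD, AFAE5_REVERSE))
```

The same check could be run against any full-length amplicon from the sequence listing to confirm that a primer pair and its probe sequence agree.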

[0098] TABLE 3 Pathotype grouping of E. coli virulence genes

Pathotype | Pathotype-specific virulence genes
UPEC | sfaA; sfaDE; clpG; iutA; nfaE; pai; iroN; cvaC; kpsMT2; kpsMT3; hlyA; hlyC; focG; afaD8; bmaE; cs31A; drb122; kfiB; afa3; afa5; afaE7; papEF; papC; papGI; papGII; papGIII; papAH
ETEC | lngA; sth; stp; stb; lt; F18; F41; leoA; rfbO101; F5; F6; F17A; F17G; cfaI; cs1; cs3; F4
EPEC | bfpA; eaf; espC
EHEC | ehxA; etpD; katP; L9075; rfbEO157; rfbO111; rfbO157H7; rtx; stx1; stx2; stxA1; stxA2; StxB1; StxB2; Stx3A
EPEC and EHEC (i.e. common to both) | eae; espP; espA1; espA2; espA3; paa; espB1; espB2; espB3; tir1; tir2; tir3; espC
DAEC | aida
EAEC | aggA; aggC
EIEC | ipaC; invX
CDEC | cdt1; cdt2; cdt3; cnf1; cnf2
MENEC | rfcO4; iucD; ibe10; neuC; rfbO9
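The grouping of Table 3 lends itself to a simple lookup from detected genes to candidate pathotypes. The following is a minimal sketch under stated assumptions: the dictionary is an abbreviated excerpt of Table 3, the function name is hypothetical, and the equivalence of the table entry stp with the StaP toxin gene named in the text is assumed. It is not presented as the scoring procedure actually used with the array.

```python
# Abbreviated excerpt of Table 3 (not the full gene lists).
PATHOTYPE_GENES = {
    "ETEC": {"F4", "F5", "F6", "F18", "F41", "stp", "sth", "stb", "leoA", "lngA"},
    "EPEC": {"bfpA", "eaf", "espC"},
    "EHEC": {"ehxA", "etpD", "katP", "rtx", "stx1", "stx2"},
    "EAEC": {"aggA", "aggC"},
    "CDEC": {"cdt1", "cdt2", "cdt3", "cnf1", "cnf2"},
}


def pathotype_hits(detected):
    """Map each pathotype to the sorted list of its marker genes found in
    the set of detected genes; pathotypes with no hits are omitted."""
    detected = set(detected)
    hits = {}
    for pathotype, genes in PATHOTYPE_GENES.items():
        found = sorted(genes & detected)
        if found:
            hits[pathotype] = found
    return hits


# Genes reported in the text for bovine strain B00-4830
# (assumption: "stp" stands for the StaP toxin gene).
b00_4830 = {"F5", "stp", "etpD", "traT", "ompA", "fimH", "fliC"}
print(pathotype_hits(b00_4830))
```

For this strain the lookup returns ETEC markers together with a single EHEC-associated gene, mirroring the mixed pattern discussed for B00-4830.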

[0099] Throughout this application, various references are referred to describe more fully the state of the art to which this invention pertains. The disclosures of these references are hereby incorporated by reference into the present disclosure.

REFERENCES

[0100] 1. An, H., et al. (1999) Adv Exp Med Biol. 473:179-84.

[0101] 2. Anderson, J. D., et al. (1987) Pediatr Infect Dis J. 6:1135-6.

[0102] 3. Bach, S., et al. (2000) FEMS Microbiol Lett. 183:289-94.

[0103] 4. Beaudry, M., et al. (1996) J Clin Microbiol. 34:144-8.

[0104] 5. Benz, I., et al. (1992) Mol Microbiol. 6:1539-46.

[0105] 6. Bertani, G. (1951) J. Bacteriol. 62:293-300.

[0106] 7. Bertin, Y., et al. (1998) FEMS Microbiol Lett. 162:235-9.

[0107] 8. Beutin, L. (1999) Vet Res. 30:285-98.

[0108] 9. Bingen, E., et al. (1998) J Infect Dis. 177:642-50.

[0109] 10. Bloch, C. A., et al. (1996) Infect Immun. 64:3218-23.

[0110] 11. Blum, G., et al. (1994) Infect Immun. 62:606-14.

[0111] 12. Boyd, E. F., et al. (1998) J Bacteriol. 180:1159-65.

[0112] 13. Brunder, W., et al. (1999) Microbiology. 145:1005-14.

[0113] 14. Brunder, W., et al. (1996) Microbiology. 142:3305-15.

[0114] 15. Burland, V., et al. (1998) Nucleic Acids Res. 26:4196-204.

[0115] 16. Call, D. R., et al. (2001) Int J Food Microbiol. 67:71-80.

[0116] 17. Cebula, T. A., et al. (1995) J Clin Microbiol. 33:248-50.

[0117] 18. Chizhikov, V., et al. (2001) Appl Environ Microbiol. 67:3258-63.

[0118] 19. Cho, J. C., et al. (2001) Appl Environ Microbiol. 67:3677-82.

[0119] 20. Cid, D., et al. (1999) J Clin Microbiol. 37:1370-5.

[0120] 21. Clegg, S. (1982) Infect Immun. 38:739-44.

[0121] 22. Daigle, F., et al. (1994) Can J Microbiol. 40:286-91.

[0122] 23. Dallas, W. S., et al. (1979) J Bacteriol. 139:850-8.

[0123] 24. De Boer, E., et al. (1999) Int J Food Microbiol. 50:119-30.

[0124] 25. Dozois, C. M., et al. (1997) FEMS Microbiol Lett. 152:307-12.

[0125] 26. Dozois, C. M., et al. (2000) Infect Immun. 68:4145-54.

[0126] 27. Elliott, S. J., et al. (1998) Mol Microbiol. 28:1-4.

[0127] 28. Evans, D. G., et al. (1978) Infect Immun. 21:638-47.

[0128] 29. Evans, D. G., et al. (1977) J Infect Dis. 136 Suppl:S118-23.

[0129] 30. Evans, D. J., Jr., et al. (1976) J Infect Dis. 133 Suppl:97-102.

[0130] 31. Feng, P., et al. (2000) Mol Cell Probes. 14:333-7.

[0131] 32. Franke, J., et al. (1994) J Clin Microbiol. 32:2460-3.

[0132] 33. Furrer, B., et al. (1990) Lett Appl Microbiol. 10:31-4.

[0133] 34. Gannon, V. P., et al. (1997) J Clin Microbiol. 35:656-62.

[0134] 35. Gannon, V. P., et al. (1992) Appl Environ Microbiol. 58:3809-15.

[0135] 36. Garcia, E., et al. (1988) Antonie Van Leeuwenhoek. 54:149-63.

[0136] 37. Girardeau, J. P., et al. (1991) J Bacteriol. 173:7673-83.

[0137] 38. Groisman, E. A., et al. (1996) Cell. 87:791-4.

[0138] 39. Gunzburg, S. T., et al. (1995) J Clin Microbiol. 33:1375-7.

[0139] 40. Hacker, J., et al. (1990) Microb Pathog. 8:213-25.

[0140] 41. Harel, J., et al. (1993) Vet Microbiol. 38:139-55.

[0141] 42. Herrero, M., et al. (1988) J Bacteriol. 170:56-64.

[0142] 43. Hu, Y., et al. (1999) J Appl Microbiol. 87:867-76.

[0143] 44. Huang, S. H., et al. (1995) Infect Immun. 63:4470-5.

[0144] 45. Imberechts, H., et al. (1992) Infect Immun. 60:1963-71.

[0145] 46. Ito, H., et al. (1990) Microb Pathog. 8:47-60.

[0146] 47. Johnson, J. R., et al. (1998) J Infect Dis. 177.

[0147] 48. Johnson, J. R., et al. (1996) J Infect Dis. 173:920-6.

[0148] 49. Johnson, J. R., et al. (2001) J Infect Dis. 183:78-88.

[0149] 50. Johnson, J. R., et al. (2001) J Infect Dis. 183:1508-17.

[0150] 51. Johnson, J. R., et al. (2000) Infect Immun. 68:3327-36.

[0151] 52. Johnson, J. R., et al. (2002) J Infect Dis. 185:774-84.

[0152] 53. Johnson, J. R., et al. (2000) Infect Immun. 68:3040-7.

[0153] 54. Johnson, J. R., et al. (2000) J Infect Dis. 181:261-72.

[0154] 55. Johnson, J. R., et al. (2001) J Infect Dis. 183:897-906.

[0155] 56. Johnson, W. M., et al. (1988) Microb Pathog. 4:103-13.

[0156] 57. Johnson, W. M., et al. (1987) FEMS Microbiol Lett. 43:19-23.

[0157] 58. Knutton, S., et al. (1987) Infect Immun. 55:69-77.

[0158] 59. Kuhnert, P., et al. (2000) FEMS Microbiol Rev. 24:107-17.

[0159] 60. Kurazono, H., et al. (2000) Microb Pathog. 28:183-9.

[0160] 61. Lalioui, L., et al. (1999) Infect Immun. 67:5048-59.

[0161] 62. Le Bouguenec, C., et al. (1992) J Clin Microbiol. 30:1189-93.

[0162] 63. Li, J., et al. (2001) J Clin Microbiol. 39:696-704.

[0163] 64. Makino, S., et al. (1999) Epidemiol Infect. 123:25-30.

[0164] 65. Marc, D., et al. (1996) J Med Microbiol. 44:444-52.

[0165] 66. Martin, C., et al. (1997) Res Microbiol. 148:55-64.

[0166] 67. Mitsumori, K., et al. (1998) FEMS Immunol Med Microbiol. 21:261-8.

[0167] 68. Moseley, S. L., et al. (1983) J Bacteriol. 156:441-3.

[0168] 69. Muhldorfer, I., et al. (1994) Microb Pathog. 16:171-81.

[0169] 70. Murray, A. E., et al. (2001) Proc Natl Acad Sci U S A. 98:9853-8.

[0170] 71. Nataro, J. P., et al. (1998) Clin Microbiol Rev. 11:142-201.

[0171] 72. Ojeniyi, B., et al. (1994) Zentralbl Veterinarmed [B]. 41:49-59.

[0172] 73. Orndorff, P. E., et al. (1984) J Bacteriol. 159:736-44.

[0173] 74. Oswald, E., et al. (1994) J Med Microbiol. 40:428-34.

[0174] 75. Paton, A. W., et al. (1998) J Clin Microbiol. 36:598-602.

[0175] 76. Perna, N. T., et al. (2001) Nature. 409:529-33.

[0176] 77. Pfaff-McDonough, S. J., et al. (2000) Avian Dis. 44:23-33.

[0177] 78. Rich, C., et al. (1999) FEMS Microbiol Lett. 173:55-61.

[0178] 79. Russo, T. A., et al. (1999) Infect Immun. 67:5306-14.

[0179] 80. Sandhu, K. S., et al. (1997) Adv Exp Med Biol. 412:295-302.

[0180] 81. Savarino, S. J., et al. (1993) Proc Natl Acad Sci U S A. 90:3093-7.

[0181] 82. Schmidt, H., et al. (1997) FEMS Microbiol Lett. 148:265-72.

[0182] 83. Starcic, M., et al. (2002) Vet Microbiol. 85:361-77.

[0183] 84. Strockbine, N. A., et al. (1988) J Bacteriol. 170:1116-22.

[0184] 85. Wasteson, Y., et al. (1988) J Clin Microbiol. 26:2564-6.

[0185] 86. Welch, R. A., et al. (1983) Infect Immun. 42:178-86.

[0186] 87. Wilfert, C. M. (1978) Annu Rev Med. 29:129-36.

[0187] 88. Yamamoto, S., et al. (1995) FEMS Immunol Med Microbiol. 12:85-90.

[0188] 89. Yamamoto, S., et al. (1995) FEMS Immunol Med Microbiol. 12:85-90.

[0189] 90. Yamamoto, T., et al. (1996) Infect Immun. 64:1441-5.

[0190] 91. Yamasaki, S., et al. (1996) Microbiol Immunol. 40:345-52.

[0191] 92. Zhu, C., et al. (2001) Infect Immun. 69:2107-15.

1 176 1 793 DNA Escherichia coli 1 catcaagctg tttgttcgtc cgccggcggt gaaggggcga ccggatgatg tggccggcaa 60 ggtggagtgg cagagggccg gcaacaggct gaagggggtt aacccgacgc cgttttacat 120 caacctgtcc acgctgacgg tggggggtaa ggaagtgaag gagcgtgaat atattgcgcc 180 gttttcctcc cgtgaatatc cgctgcctgc ggggcatcgg gtaaggttca gtggaaggtg 240 ataacggatt acggcgggac cagtaagcag tttgaggcag agctgaaggg ttgaatacat 300 aaggtgataa cagggtaaat gacgggctga cagatgcgtg atacttcttc agggcggatg 360 agaacggggg tgacagggct ggcgctggct gtgatggtgg cctgtgtgat gtttcgtgcg 420 gagagtggta ttgcgcgcac ctactccttt gatgcggcca tgctgaaagg tggcgggaag 480 ggggtggacc tgaccctgtt tgaggaaggt gggcagttac ccggcattta tccggttgac 540 attatcctga atggttcccg tgtggattca caggagatgg cctttcacgc ggagagggac 600 gcggagggca ggccttatct gaagacctgt ctgacccgtg agatgctggc gcgttacggg 660 gtcaggattg aggaatatcc ggcgttgttc cgtgcatccg gagagggtcg tggtgcctcc 720 gtggcggagg aggcctgtgc tgacctgacg gcgataccgc aggccacgga gagttatcag 780 tttgctgccc agc 793 2 470 DNA Escherichia coli 2 gcgatcatgg ccgcgaccag cactatcctc gcgatgagct cctcgcatgc agcgttcaca 60 ggaagtggta gcaccggtac gacaaaacta accgttaccg aacagtgcca agtgctggtc 120 accggatctg acgtcaccaa aacgcgcgga gaactcaccg acggggcccg tgtgggggtc 180 ctgtccgtaa ccgcaaaagg ctgtaacacc gagcatgcag cgttgcgtgc acagccagac 240 aactaccacc agggcaagat cgtactgatc cgcgatgact atcaggcacg gataaatgtc 300 cgcttgcagg ccaccgacgg gcgtgcgtgg aataccaacg gcgacaccgt ataccgcgcc 360 gatgctggga actggggtgg cagcttgttc gtagtcgtgg acggggacaa cgtggacaaa 420 ccgaccgggt cctacacact gaacctggac tggggctact gggtgagttg 470 3 618 DNA Escherichia coli 3 gctaaatcaa ctgttgatgt tgcaacggat tctgttgata catcctttat catccgggac 60 gactgtgcga tttctgttac tgcgcagtct ccaaaaacgt ttaccttgag tgaagtcaaa 120 aatcatgtta gagcagcaga tattaccatt acacctacat gtggtggtaa atatttatgg 180 gcagaaatga aagaggttga ctctcaggga tttggtattg cgagaactga tgctggtgat 240 cttgcttcta ttacttgggt acaggatggg aattgggatg ctggtgaagg tgagagagtg 300 gctaagacaa ccctaccgac cgtggcttct caaggggttc cttatcctgc agtcttcgta 360 actcagggga 
gtacgggtta ccgaagaaag ctggagaata caaatttggc ctcactgttg 420 gctattgggt ggagtgagtt agaccgcaaa aattagtgtt ctctacaaag catactgatt 480 gattaacttt caaggaagtt cattatgcag ggatgcagtc aagtgcaaaa tcgtatcgtg 540 attgttacca acagtgtgaa agtgactctt ctgttagcgc cagtgatata agacggtaat 600 tcgccatttg gattgtcc 618 4 351 DNA Escherichia coli 4 gttgaactga gtcttaatac cagtgatgga aggagtggcg agttaaaaga cggtacgaag 60 gtggcaacag gaaggattat ctgccgaggc acctatacaa gttttcatat ctggatgaat 120 agcagacaaa tgggaaatat tcctggtcac tatattatac tgggtagaca tgacagtcat 180 aatgaaatgc gggttaggct ggatggcgca ggatggttgc catcggtaag tgatgggcaa 240 ggtatggtca gtaccgggat acctgagcag catacatttg atgttgtgat tgacggaaat 300 cagctgcttg ggcctgatga atatatatta tcagttagcg gagaatgctc a 351 5 432 DNA Escherichia coli 5 gcgttagaaa gacctccaat aaaagcaact gagacaatcc gcctcaccgt tacaaatgat 60 tgtcctgtta ctatagctac aaatagtcca ccaaatgttg gtgtatcgtc aacaacacca 120 ataatattta acgcaacagt aacgacgaca gagcaatgtg ctaaaagcgg tgcaagggtc 180 tggttatggg gaacaggtgc cgctaataag tgggtcctag agcatactac aaatacaaaa 240 caaaaataca cattaaatcc atctatagat ggaaattcat atttccagac tccaggaact 300 aatgcagcaa tttataaaaa tgtgacaacc agagacagag ttctgaaggc aagtgtcaag 360 gttgacccta aaattcaagt attaatacca ggcgaatata gaatgatact ccatgccgga 420 attaattttt aa 432 6 528 DNA Escherichia coli 6 tattaaacca tggtagcggg gggatagact taactctact tgagaaagga gggcagttgc 60 ctggtattta tccggttgat ataattttaa atggttcgcg tattgattca agggatatat 120 tcttttacac aaaaaaaaat aggcatggtg aatattacct gaaaccctgt ttaactcgag 180 atattttgat taattacgga gtaaaaacag aagaataccc taatcttttc cggcaaaata 240 gtgaaaaaaa tagagacagt agcgattgtg ctgacttatc agtgatcccc caagctacag 300 aagactatca ttttataaaa cagcaactaa tactcggaat tccacaagtt gcgatccgcc 360 caccattgac aggcattgcc catgaaacaa tgtgggatga tggtatatca gcatttttgt 420 tgaactggca agtagagggg agtcattggg agtatagaag taatactcga aattcttcag 480 acaatttttg ggccagtttg gaacctggaa tcaatctcgg atcttggc 528 7 586 DNA Escherichia coli 7 acagtatcat atggagccac tccagacagg cctggattgt ggcctcagag 
ttagccagag 60 gacatggttt tgtccttgca aaaaatacac tgctggtatt ggcggttgtt tccacaatcg 120 gaaatgcatt tgcagtaaat atttcaggca cagtatcttc aggaggaact gtttcttccg 180 gcgaaacaca aatcgtgtat tccggtcggg gaaacagtaa tgccactgta aatagtggag 240 gaacacaaat cgtcaataat ggtgggaaaa ccactgctac aactgttaat agttcaggaa 300 gccagaacgt cgggacttca ggagcaacaa taagcacaat tgtcaattct ggtggcattc 360 agcgagtcag ttcaggtggt gtggcctctg caacaaattt aagtggcggg gctcagaaca 420 tctataatct tggccatgca tcaaataccg ttatttttag cggtggaaat cagacgattt 480 tttcaggagg tataactgat agtacaaata tcagctccgg tggccaacag cgtgtcagta 540 gtggtggcgt tgcctcgaac accaccatta atagttctgg cgcaca 586 8 324 DNA Escherichia coli 8 aatggtgctt gcgcttgctg ccaccgttac cgcaggtgtg atgttttact accagtctgc 60 gtctgattcc aataagtcgc agaatgctat ttcagaagta atgagcgcaa cgtctgcaat 120 taatggtctg tatattgggc agaccagtta tagtggattg gactcaacga ttttacttaa 180 cacatctgca attccggata attacaaaga tacaacaaac aaaaaaataa ccaacccatt 240 tgggggggaa ttaaatgtag gtccagcaaa caataacacc gcatttggtt actatctgac 300 gcttaccagg ttggataaag cggc 324 9 505 DNA Escherichia coli 9 atggcgctaa cttgccatgc tgtgacagta acagccactc atacagttga atcagatgct 60 gaattcacaa tagattgggt cgacgctggg ccaacgacta cagatgcaaa agatggtgag 120 gtttgggggc accttgatat gactcaaacc aggggaacac caacattcgg aaaactccgc 180 aatcctcaag gagagacttc gccaggaccg ttgaaggcgc cattcagttt taccgggcca 240 gatggtcata ctgcaagagc gtaccttgat tcatacggcg caccgattca caactacgca 300 ggggataacc ttgctaatgg ggtgaaggta ggtagtggaa gcggaaacac tccatttgtt 360 gttgggacag caagtcgact aactgcaaga atcttcggag accagacatt ggttccagga 420 gtctaccgga caacctttga attaactact tggaccgact gacggaaaat taacctgatg 480 aaagaagggg gctatatgtc cccct 505 10 412 DNA Escherichia coli 10 caatagtcgc ccacaggagt tgtttatata tttctcacgt gttgatgcat tcgctaacag 60 agtaaatctt gcgattgttt caaacagaag agctgatgag gtgattgtat tacctcctcc 120 aactgttgta tcacgaccga tcatcggcat tagaattggt aatgatgttt tcttctcaac 180 ccatgcattg gcgaatcggg gcgtggattc aggagcaatt gtaaatagtg tttttgagtt 240 cttcaacaga caaacggatc 
ctataagaca ggccgctaac tggatgattg caggagattt 300 taaccgttca ccggctacac tattttcaac tcttgaacca gggattcgca atcatgtaaa 360 tattattgct ccaccagatc caacgcaagc cagtggtggt gttcttgatt at 412 11 556 DNA Escherichia coli 11 gaaagtaaat ggaatataaa tgtccggcaa ttaatttctg gtgaaaatgc tgtagacatt 60 ttagctgtac aagaggcagg ctctccgccg tcaacggctg tagatacagg tacacttatt 120 ccttccccag gaattcccgt ccgagagctt atctggaact tgtcgacaaa tagcaggcca 180 cagcaagtat atatatattt ttccgctgtt gatgccctcg gtggaagagt caatcttgct 240 ctggttagca atcggcgggc cgatgaagtg tttgttctta gtcctgtaag acaaggtgga 300 cgaccattgc ttggcatacg aattggtaat gatgcatttt tcactgcaca cgccatagct 360 atgcgaaaca atgatgcccc ggctcttgtt gaggaagtgt ataacttctt ccgcgacagc 420 agagacccag tacaccaggc gcttaactgg atgattcttg gtgatttcaa ccgtgaacct 480 gcggatttag agatgaacct tactgttccc gtaagaaggg catcagaaat tatttcacca 540 gcggcggcaa cacaaa 556 12 556 DNA Escherichia coli 12 gaaagtaaat ggaatataaa tgtccgacaa ttaatttctg gtgaaaatgc cgtagatatt 60 ttagctgtgc aggaggcagg ttctccgcca tcaacggctg tagatacagg tagagttatt 120 ccttccccag gcattcctgt ccgggagctt atctggaact tgtctacaaa tagcagacca 180 cagcaagtat atatatattt ttctgctgtt gatgcctttg gtggaagggt caatcttgct 240 ctggttagca atcggcaggc cgatgaagtg tttgttcttc gcccggtaag gcaaggtggg 300 cggccattgc ttggcatacg gattggcaat gatgcatttt tcactgcaca tgcgatagct 360 acgcgaaaca atgacgctcc cgctcttgtt gaagaagtct atagcttctt tcgtgacagc 420 cgagacccag tccaccaggc cattaactgg atgattcttg gtgattttaa tcgcgaacct 480 gatgatttag aggtgaacct tacagttcct gtaagaaatg catcagaaat tattttccct 540 gctgcaccga cacaaa 556 13 479 DNA Escherichia coli 13 ggtgcaatgg ctctgaccac aatgtttgta gcagtgagtg cttcagcagt agagaaaaat 60 attactgtaa cagctagtgt tgatcctgca attgatcttt tgcaagctga tggcaatgct 120 ctgccatcag ctgtaaagtt agcttattct cccgcatcaa aaacttttga aagttacaga 180 gtaatgactc aagttcatac aaacgatgca actaaaaaag taattgttaa acttgctgat 240 acaccacagc ttacagatgt tctgaattca actgttcaaa tgcctatcag tgtgtcatgg 300 ggaggacaag tattatctac aacagccaaa gaatttgaag ctgctgcttt gggatattct 360 
gcatccggtg taaatggcgt atcatcttct caagagttag taattagcgc tgcacctaaa 420 actgccggta ccgccccaac tgcaggaaac tattcaggag tagtatctct tgtaatgac 479 14 403 DNA Escherichia coli 14 gggcgctctc tccttcaaca acactatcaa ggaaatgaca ggtgacagta agctgctgac 60 catcactcag tctgaaccag ctcctattct tttagggcgc acaaaagagg cgtttgcagc 120 atcgattgtt ggtgttggtg caattccttt aattgcgttc agtgattatg aagggaacgg 180 agttgcctta cagagttctg gggataacgg taaggggttc tttgaattgc ccatgaaaga 240 tgatagtgga aataatctcg gtagcgtaaa agttaatgtt acttctgctg gcctgttttc 300 ctatagtgaa atatcaacag gtttagttgg tataacttct gttgccagtg gcgataatac 360 aagtatttat tatggtggtc tggtgtcgcc agcaattagg gcg 403 15 1112 DNA Escherichia coli 15 gggggaagta cagaagaatt acacgaaatt ttgttaggtc agggcccaca gtcaagctta 60 ggttttactg aatatacctc aaatgttaac agtgcagatg cagcaagcag acgacacttt 120 ctggtagtta taaaagtgca cgtaaaatat atcaccaata ataatgtttc atatgttaat 180 cattgggcaa ttcctgatga agccccggtt gaagtactgg ctgtggttga caggagattt 240 aattttcctg agccatcaac gcctcctgat atatcaacca tacgtaaatt gttatctcta 300 cgatatttta aagaaagtat cgaaagcacc tccaaatcta actttcagaa attaagtcgc 360 ggtaatattg atgtgcttaa aggacgggga agtatttcat cgacacgtca gcgtgcaatc 420 tatccgtatt ttgaagccgc taatgctgat gagcaacaac ctctcttttt ctacatcaaa 480 aaagatcgct ttgataacca tggctatgat cagtatttct atgataatac agtggggcta 540 aatggtattc caacattgaa cacctatact ggggaaattc catcagactc atcttcactc 600 ggctcaactt attggaagaa gtataatctt actaatgaaa caagcataat tcgtgtgtca 660 aattctgctc gtggggcgaa tggtattaaa atagcacttg aggaagtcca ggagggtaaa 720 ccagtaatca ttacaagcgg aaatctaagt ggttgtacga caattgttgc ccgaaaagaa 780 ggatatattt ataaggtaca tactggtaca acaaaatctt tggctggatt taccagtact 840 accggggtga aaaaagcagt tgaagtactt gagctactta caaaagaacc aatacctcgc 900 gtggagggaa taatgagcaa tgatttctta gtcgattatc tgtcggaaaa ttttgaagat 960 tcattaataa cttactcatc atctgaaaaa aaaccagata gtcaaatcac tattattcgt 1020 gataatgttt ctgttttccc ttacttcctt gataatatac ctgaacatgg ctttggtaca 1080 tcggcgactg tactggtgag agtggacggc aa 1112 16 1241 DNA Escherichia 
coli 16 tatcatacgg caggaggaag caccgaagaa ttgcatgaga ttttgttggg gcaaggccca 60 cagtcgagtt taggttttac tgaatatact tcaaatatta acagtgcaga tgcggcaagc 120 agacgacatt ttcttgtagt cataaaagtg caagtgaaat atataaacaa taataacgtt 180 tcgcatgtta atcactgggc aattcctgat gaggctccag tagaagtact ggctgtggtt 240 gacaggagat ttaatttccc tgagccatca actccaccta atatatcaat tatacacaag 300 ttgttatctc tgagatattt taaagaaaat atcgaaagta catcaaggct taacttacag 360 aaattaaatc gtggtaatat tgatatattt aaagggaggg ggagtatttc atcaacacgt 420 cagcgtgcga tttatccgta ttttgaatct gctaatgctg atgagcaaca acctgtcttt 480 ttctacatca aaaaaaaccg gtttgatgac tttggctatg atcaatattt ctataatagt 540 acagtggggt tgaatggtat tcccacattg aacacctata ctggagaaat tctatcagac 600 gcatcctcgc tcggctcaac ttattggaaa aagtataatc tcactaatga aacaagcatc 660 attcgtgtat caaattctgc tcgaggggca aatggtataa aaatagcact tgaagaagtg 720 caggaaggta aaccggtaat cattacaagc ggaaatttga gcggttgtac aacaattgtt 780 gctcgaaaag gaggatacct ttataaggta catacaggta caacaatacc tttagctggt 840 tttacaagta caacaggggt aaaaaaagct gtagaagttt ttgaattact tacaaataat 900 ccaatgccgc gcgtagaggg agtaatgaat aatgattttt tggtaaatta tctggcggaa 960 agttttgatg agtctttaat aacgtactca tcatctgaac aaaaaatagg tagtaagatt 1020 actatttctc gcgacaatgt ttctactttt ccttactttc ttgataacat accagaaaaa 1080 ggctttggta catcggtgac tatattggta agagtagatg gtaatgttat cgtaaaatcc 1140 ttatctgaga gttattcttt aaatgtagaa aactccaata tatcagtatt gcatgttttt 1200 tcaaaagatt tttgattcgg aaaattattg tctattgtga c 1241 17 321 DNA Escherichia coli 17 gctcacacca tcaacaccgt tgttcataca aatgactcag ataaaggtgt tgttgtgaag 60 ctgtcagcag atccagtcct gtccaatgtt ctgaatccaa ccctgcaaat tcctgtttct 120 gtgaatttcg caggaaaacc actgagcaca acaggcatta ccatcgactc caatgatctg 180 aactttgctt cgagtggtgt taataaagtt tcttctacgc agaaactttc aatccatgca 240 gatgctactc gggtaactgg cggcgcacta acagctggtc aatatcaggg actcgtatca 300 attatcctga ctaagtcaac g 321 18 401 DNA Escherichia coli 18 gggcccactc taaccaaaga actggcatta aatgtgcttt ctcctgcagc tctggatgca 60 acttgggctc ctcaggataa 
tttaacatta tccaatactg gcgtttctaa tactttggtg 120 ggtgttttga ctctttcaaa taccagtatt gatacagtta gcattgcgag tacaagtgtt 180 tctgatacat ctaagaatgg tacagtaact tttgcacatg agacaaataa ctctgctagc 240 tttgccacca ccatttcaac agataatgcc aacattacgt tggataaaaa tgctggaaat 300 acgattgtta aaactacaaa tgggagtcag ttgccaacta atttaccact taagtttatt 360 accactgaag gtaacgaaca tttagtttca ggtaattacc g 401 19 731 DNA Escherichia coli 19 gcacaattac tgctgatgcg tataaagaca aatgggaatg gatggttggg ggcgctctct 60 ccttcaacaa cactatcaag gaaatgacag gtgacagtaa gctgctgacc atcactcagt 120 ctgaaccagc tcctattctt ttagggcgca caaaagaggc gtttgcagca tcgattgttg 180 gtgttggtgc aattccttta attgcgttca gtgattatga agggaaagga gttgccttac 240 agagttctgg ggataacggt aaggggttct ttgaattgcc catgaaagat gatagtggaa 300 ataatctcgg tagcgtaaaa gttaatgtta cttctgctgg cctgttttcc tatagtgaaa 360 tatcaacagg tttagttggt ataacttctg ttgccagtgg cgataataca agtatttatt 420 atggtggtct ggtgtcgcca gcaattaggg cgggtaaaga cgcagcatca gctgtgtcga 480 aatttggcaa ctataatcat acacaattgc tgggccagct tcaagcagta aaccctaacg 540 cgggcaatag aggacaagta aataaaaata gtgcggtctc acaaaatatg gtgatgacta 600 ctggtgatgt aattgcatcc tcttacgcac ttggtattga ccagggacag actattgaag 660 caacctttac taatcctgtg gttagcacca cccagtggag tgctccgctg aacgtggcag 720 taacttataa c 731 20 677 DNA Escherichia coli 20 cacacacaaa cgggagctgt ttgtagcgaa gccactcgtt caaatcaatt ctcttgacgt 60 ggggaaatcc gttttccaag cggacccctt atagggggtt gagggcctcc tacccttcac 120 tcttgactat gttaacgata atcattatcg ttagtgtttg tgtggtaatg ggatagaaag 180 taatgggata aaaagtaatg gatagaaaaa gaacaaaatt agagttgtta tttgcattta 240 taataaatgc caccgcaata tatattgcat tagctatata tgattgtgtt tttagaggaa 300 aggacttttt atccatgcat acattttgct tctctgcatt aatgtctgca atatgttact 360 ttgttggtga taattattat tcaatatccg ataagataaa aaggagatca tatgagaact 420 ctgactctaa atgaattaga ttctgtttct ggtggtgctt cagggcgtga tattgcgatg 480 gctataggaa cactatccgg gcaatttgtt gcaggaggaa ttggagcagc tgctgggggt 540 gtggctggag gtgcaatata tgactatgca tccactcaca aacctaatcc tgcaatgtct 600 
ccatccggtt tagggggaac aattaagcaa aaacccgaag ggataccttc agaagcatgg 660 aactatgctg cgggaag 677 21 260 DNA Escherichia coli 21 cgtgtgggag ccctgagcct taacgcgaga gggtgtaaca ccgagcatgc agcgctgcgc 60 gctcaagcag acaactacca caatggcaag atcgtactgc tacgtgaaga ccaacaagcg 120 cggataaatg tgcgcttggt ggcatccgat gggggccaat ggactaacga cggcgcaacc 180 acataccgcg acgctgccgg ggactggggc gggagcttgt acgtagtcgt ggacggggac 240 aatactagca accaggccgg 260 22 791 DNA Escherichia coli 22 cattatggaa cggcagaggt taatctgcag agtggtaata actttgacgg tagttcactg 60 gacttcttat taccgttcta tgattccgaa aaaatgctgg catttggtca ggtcggagcg 120 cgttacattg actcccgctt tacggcaaat ttaggtgcgg gtcagcgttt tttccttcct 180 gaaaatatgt tgggctataa cgtcttcatt gatcaggatt tttctggtga taatacccgt 240 ttaggtattg gtggcgaata ctggcgagac tatttcaaaa gtagtattaa cggctatttc 300 cgcatgagcg gctggcatga gtcatacaat aagaaagact atgatgagcg cccagcaaat 360 ggcttcgata tccgttttaa tggctatctg ccatcatacc cggcattagg tgccaagctg 420 atgtatgagc agtattatgg tgataatgtt gctttgttta attctgataa gctgcagtcg 480 aatcctggtg cggcgaccgt tggtgtaaac tatactccga ttcctctggt gacgatgggg 540 atcgattacc gtcatggtac gggtaatgaa aatgatctcc tttactcaat gcagttccgt 600 tatcagtttg ataaaccgtg gtctcagcaa attgagccac aatatgttaa cgagttaaga 660 acattatcag gcagccgtta cgatctggtt cagcgtaata acaatattat tctggagtac 720 aaaaagcagg atattctttc tctgaatatt ccgcatgata ttaatggtac tgaacgcagt 780 acgcagaaga t 791 23 397 DNA Escherichia coli misc_feature (224)..(224) n is a, c, g, or t 23 cagggtaaaa gaaagatgat aagttaacgc ttggagtgat cgaacgggat ccaaatcact 60 gatcgatgtt ccatgcgaat aagtgatcga tcatgtcgga atatccaaaa acccgaaatc 120 accagttgcc acattgaacg gcgctggtga tttcgggttc gtcactttat ggataccatc 180 aacccattcc ccggagaaag tatgggctgg ctaaagtgta gcgncattaa gagcagttat 240 ttagtatttt aatgagtatc gaatctttat atttgcatca ttccgttgtt ggtccgcctt 300 ctgacaagct gtgttggcag aagaaacgtc gttagcggtt cctattttgt tactacctag 360 atatatatca ggtttttgat aatacatggt ccccata 397 24 117 DNA Escherichia coli 24 tcggatgcca tcaacacagt atatccgaag 
gcccgcatcc agttatgcat cgtgcatatg 60 gtgcacaaca gcctgcgctt cgtgtcatgg aaggactaca aagccgtcac tcgcgac 117 25 159 DNA Escherichia coli 25 gtttattctg gggcaggctc atcagaagta tttgctggtg aaggtcatga taccgtatct 60 tataataaga cggatgttgg taaactaaca attgatgcaa caggagcatc aaaacctggt 120 gaatatatag tttcaaaaaa tatgtatggt gacgtgaag 159 26 525 DNA Escherichia coli 26 ggatacatca actgcaacat cagttgctag tgcgaacgcg agtacttcga catcgacagt 60 ctatgactta ggcagtatgt cgaaagacga agtagttcag ctatttaata aagtcggtgt 120 tttcagtgcg cttctcatgt ttgcctatat gtatcaggca caaagcgatc tgtcgattgc 180 aaagtttgct gatatgaatg aggcatctaa ggagtcaacc acagcccaaa aaatggctaa 240 tcttgtggat gctaaaattg ctgatgttca gagtagttct gacaagaatg caaaagccaa 300 actacctaaa gaagtgattg actatataaa tgatcctcgc aatgacatta cagtaagtgg 360 tattagcgat ctaaatgctg aattaggcgc tggtgatttg caaacggtga aggccgctat 420 ttcggccaaa tcgaataact tgaccacggt agtgaataat agccagcttg aaatacagca 480 aatgtcaaat acgttaaacc tattaacgag tgcacgttct gatat 525 27 523 DNA Escherichia coli 27 cgacatcgac gatctatgac ttaggtaata tgtcgaagga tgaggtggtt aagctatttg 60 aggaactcgg tgtttttcag gctgcgattc tcatgttttc ttatatgtat caggcacaaa 120 gtaatctgtc gattgcaaag tttgctgata tgaatgaggc atctaaagcg tcaaccacgg 180 cacaaaagat ggctaatctt gtggatgcca aaattgctga tgttcagagt agcactgata 240 agaatgcgaa agccaaactt cctcaagacg tgattgacta tataaacgat ccacgtaatg 300 acataagtgt aactggtatt cgtgatctta gtggtgattt aagcgctggt gatctgcaaa 360 cagtgaaggc ggctatttca gctaaagcga ataacctgac aacggtagtg aataatagcc 420 agctcgaaat tcagcaaatg tcgaatacat taaatctctt aacgagtgca cgttctgatg 480 tgcaatctct acaatataga actatttcag caatatccct tgg 523 28 487 DNA Escherichia coli 28 gggaacatgt cgaaagatga agttgttgag ctgtttaaaa aagttggcgt atttcaggct 60 gcgcttatca tgtttgctta tatgtaccag gcacaaagcg agctatctat tgctacatat 120 gcagacatga atgagtcatc taaggaatcc accgaggcac aaaaaatggc caatttggtg 180 gatgccaaga tcgctgaagt tcagtctagt tcggaaaagg ataaaaaggt caaacttcct 240 gacgaagtaa ttagttatat tcaagattca cgaaatggga tttccgtaag tagtgatatt 300 gacatcacca 
aagagttggg tgctggtgac ctgcaaaccg taaaagctgc tatttcagca 360 aaagcaaata acctgacaac gacggtgaat aataaccagc ttacgttgca gcaaatgtct 420 aatacgctga atttattaac aaatgctcgt tcagatatgc agtcgttgca atatcgaact 480 attcagg 487 29 502 DNA Escherichia coli 29 cggagagtac gaccggcgct tccagtgcag ttgccgcatc tgctttatca attgattcat 60 ctctgcttac tgatggtaag gttgatattt gtaagctgat gctggaaatt caaaaactcc 120 tcggcaagat ggtgactcta ttgcaggatt accaacaaaa acaattggcg caaagctatc 180 agattcagca ggccgttttt gagagccaga ataaagctat tgaggaaaaa aaagccgcgg 240 caaccgctgc tttggttggc gggattattt catcagcatt ggggatctta ggttcttttg 300 cagcaatgaa caacgcggct aaaggggctg gtgagattgc tgaaaaagca agctctgcat 360 cttcaaaggc tgctggtgcg gcttctgagg ttgcaaataa agctctggtc aaggctacgg 420 aaagtgttgc tgatgtcgca gaggaggcat ccagtgcgat gcagaaagcg atggccacaa 480 caacgaaagc agccagccgt gc 502 30 377 DNA Escherichia coli 30 gctgccatta atagcgcaac taaaggcgcg agtgatgtcg ctcagcaagc cgcttctact 60 tctgcgaagt ctatcggtac agtctctgaa gcttcaacta aagcactggc gaaggcttcc 120 gaaggtattg cagatgcagc agatgatgca gctggcgcaa tgcagcaaac tatcgcgaca 180 gctgcaaaag cggccagtcg tacatccggt atcactgatg atgttgctac ttcggctcag 240 aaagcttctc aggtagctga agaggctgct gatgctgctc aagaattagc acagaaggca 300 ggattattaa gtcgctttac tgctgctgcc ggaaggattt ccggttcaac gccatttatt 360 gttgttacca gccttgc 377 31 395 DNA Escherichia coli 31 gtaatgacgg ttaattctgt ttcggagaat actaccggct ctaatgcaat taccgcatct 60 gctattaatt catctttgct taccgatggt aaggtcgatg tttctaaact gatgctggaa 120 attcaaaaac tcctgggcaa gatggtgcgt atattgcagg attaccaaca gcaacagttg 180 tcgcagagct atcagatcca actggccgtt tttgagagcc agaataaagc cattgatgaa 240 aaaaaggccg ctgcaacagc cgctctggtt ggtggggcta tttcatcagt attggggatc 300 ttaggctctt ttgcagcaat taacagtgct acgaaaggcg cgagtgatat tgctcaaaaa 360 accgcctcta catcttctaa ggctattgat gcggc 395 32 500 DNA Escherichia coli 32 cccataacgg aacaactcat catgcaataa gcacccaaaa ctggggacaa agctcatata 60 aatatataga ccggatgacg aatggagatt ttgctgtaac acgacttgat aagtttgttg 120 ttgaaacaac aggggtaaaa aattcagtag 
atttttctct caatagtcat gatgctcttg 180 aacgttatgg tgtggagatc aatggtgaga aaaaaatcat tggtttcagg gttggggctg 240 ggacgactta taccgttcaa aatggtaata catatagtac aggacaggta tacaatcctc 300 ttttgttaag cgcttcaatg tttcagttaa actgggataa caaaagacca tataataaca 360 cgacaccttt ttataatgaa actaccggtg gagacagtgg ttccggtttc tatctgtatg 420 ataacgtaaa aaaagaatgg gttatgcttg gtactttatt tggaatagca tccagtggtg 480 cagatgtttg gtctattctg 500 33 1830 DNA Escherichia coli 33 aaacagcagg cacttgaacg ttacggggtt aattataaag gagaaaagaa acttatcgca 60 ttcagagccg gctctggtgt ggtatccgtt aaaaaaaatg gacgcataac tccatttaat 120 gaggtttctt ataagccaga aatgttaaat ggctctttcg ttcacattga tgactggagt 180 ggatggctga tattaaccaa caaccagttt gatgagttta ataacattgc ctctcagggt 240 gacagcggtt cagcactgtt cgtctatgat aaccaaaaga aaaagtgggt tgtcgctgga 300 actgtctggg ggatttataa ttacgccaat ggcaaaaacc acgcagcata cagtaaatgg 360 aaccagacaa ccattgacaa cctgaagaac aagtattctt acaacgtgga tatgtcaggg 420 gctcaggttg caaccattga aaatggaaaa ctgacaggca ctggctcaga caccaccgat 480 ataaaaaata aggacttaat atttactggc ggtggagata tcctcctgaa atcctctttt 540 gataatggtg ctggcggtct tgtctttaat gataaaaaga cctatcgagt aaacggggat 600 gatttcacct ttaaaggtgc cggtgttgat acaagaaacg gcagcaccgt tgagtggaat 660 atccggtatg ataataaaga caaccttcac aaaattggtg atggcacatt agatgtccga 720 aaaacccaga acaccaacct gaaaacaggt gagggtcttg tcattcttgg agctgaaaaa 780 acattcaata atatctacat aaccagtggt gatggaactg tccgactgaa tgcagaaaat 840 gcactgtctg gcggtgaata caacggtatt ttctttgcga aaaatggcgg aactcttgac 900 ctgaacggat ataatcagtc tttcaataaa attgctgcaa ctgattcagg tgctgtaata 960 accaatacgt caaccaaaaa atccatttta tccctgaata atactgctga ctatatctat 1020 cacggtaaca taaacgggaa tctggacgta cttcagcatc atgagacgaa aaaagagaac 1080 cgtcgtctta ttcttgatgg gggcgtggac acaacaaatg atataagcct gcgtaataca 1140 caactgtcca tgcagggaca tgccactgaa catgccattt atcgggatgg agctttctct 1200 tgttcactac cagctcctat gcgctttttg tgtggcagtg attatgttgc aggaatgcaa 1260 aatacagaag ctgatgctgt aaaacaaaac ggaaatgcct ataaaaccaa caatgctgtc 1320 
tctgatttat cgcagccaga ctgggaaacc ggaacattca gatttggaac gctacatctt 1380 gaaaattccg atttttctgt tggtcgtaat gcaaatgtaa tcggggacat tcaggccagt 1440 aaatcaaaca ttactattgg tgacactaca gcatatattg atttgcatgc tggtaaaaat 1500 attaccggtg atggttttgg cttccgccag aatattgtgc gtggaaactc acaaggagaa 1560 acgctgttta caggagggat cacagcagaa gacagcacta tcgttattaa agataaagca 1620 aaagcattat tttcaaatta tgtatacctg ctgaacacaa aagcaaccat agagaacggt 1680 gctgatgtga caactcaaag tggtatgttc tccacgagcg atatcagcat ctctggtaat 1740 ctgtccatga caggcaatcc cgacaaagac aataaattcg agccctcaat atatctgaat 1800 gatgcttctt atctactgac tgacgactcc 1830 34 499 DNA Escherichia coli 34 ggccactttc aatgttggtc aggaggtccc ggtactttcg ggctcacaga caacctctgg 60 ggacaatatt tttaacacgg tcgagcgcaa aacggtgggg atcaaactca gggtaaaacc 120 ccagatcaac gagggtgatt ccgtgttact ggagatagaa caggaggtgt ccggtgtggc 180 ggacactgca gtagccacca ctactgactt gggagcaacc ttcaacaccc gaacagtgac 240 caatgccatg ctggtcggga atggcgaaac ggtggtggtc ggaggattac tggataagtc 300 gatcaggggg agtgagagta aagtgccact gctgggggat atcccggtac tggggcatct 360 ttttcgcgca aaaagcgaac agacagctaa gcgtaatctg atgctgttca ttcggccaac 420 tattattcgt gagcgcgacg gatttcgtca tgcttcggcc gaaaaatacc agtcgtttaa 480 tcaggaacag gtgcagtcg 499 35 441 DNA Escherichia coli 35 atgcagaaaa ttcaatttat ccttggaata ctggcggctg catcatcttc tgctacgctt 60 gcttatgacg gtaaaattac ttttaatgga aaagttgttg atcaaacttg ttctgttaca 120 acagaaagca agaatttgac agttaagtta ccaactgtct ctgctaattc attagcttca 180 agcggaaaag tggtgggact tactcctttc acaattttgc tggaagggtg caatacgcct 240 gccgtgacag gtgctcagaa tgtaaatgct tatttcgaac ctaatgcgaa cacggattac 300 accactggta atttaactaa tacggcttct tctggtgcat ctaatgttca gattcagcta 360 ctgaatgcag atggggttaa agctattaaa cttggtcagg ctgctgcagc tcagagtgtg 420 gatacagttg ctattaatga t 441 36 950 DNA Escherichia coli 36 tgttggaccg tctcagggct cttattccag cactcatgca atggataacc tgccatttgt 60 ctataatacc ggttacaaca ttggatatca gaatgcaaat gtctggcgta ttagtggcgg 120 gttttgtgtt ggtctggacg ggaaagtgga tttacccgtg gttggcagtc 
ttgacgggca 180 gagtatttat gggctgacgg aggaggtggg actccttata tggatggggg acacgaatta 240 ttccaggggt accgcgatga gtggaaactc atgggagaat gtcttttccg gatggtgcgt 300 gggaaattat gtatcaacgc agggactgtc tgttcacgta agaccggtaa ttttaaaaag 360 aaattcctct gcgcaataca gtgtacagaa aaccagtatc gggagtatca gaatgaggcc 420 ctataacggt tcatctgcag gcagtgttca gaccacagtg aatttcagcc tgaatccatt 480 tacgctgaat gacacagtaa catcgtgcag attactgaca ccttccgcag tcaatgtcag 540 cctggctgca atttctgccg gacaactacc atcatccggt gatgaagttg tcgccgggac 600 aacatcactg aaattacagt gtgatgccgg agtaacagta tgggcaacac tgactgatgc 660 gaccacaccg tccaacagaa gcgatatact cacactgacg ggggcatcga ctgcaaccgg 720 agtcgggctg agaatataca aaaacactga cagtacgccc ctgaagtttg gacctgattc 780 gccggtaaag ggaaatgaaa accagtggca gttatcgaca ggaacggaaa cgtcaccctc 840 agtccggttg tatgtaaagt atgtgaatac tggtgaggga attaatccgg gtacggttaa 900 cggaatatca acatttacgt tttcctatca gtaacagcga gttccgggag 950 37 510 DNA Escherichia coli 37 gtgaaaagac tagtgtttat ttcttttgtt gcgctgtcca tgacagcggg ttccgcaatg 60 gctcagcaag gggatgttaa attctttggt aacgtatcag caactacctg taatttgaca 120 ccacaaataa gtggcactgt aggagatacc attcagcttg gtactgttgc accaagcgga 180 actggtagtg aaattccttt tgcactgaag gcttcttcaa atgttggcgg ttgtgcttcc 240 ttgtccacta aaacagctga tataacttgg agcgggcagt taaccgaaaa aggttttgct 300 aatcaagggg gggtggcaaa tgattcatat gtcgctctga aaaccgtgaa cggtaaaaca 360 caggggcagg aggttaaggc gtcgaatagc actgtaagtt tcgatgcatc aaaagcaact 420 acggaaggtt tcaaatttac tgctcaactg aaaggtggtc aaaccccggg tgacttccag 480 ggggcagcgg cttacgcggt tacttacaag 510 38 858 DNA Escherichia coli 38 atgaaaaaga ctctgattgc actggcaatt gctgcatctg ctgcatctgg tatggcacat 60 gcctggatga ctggtgattt caatggttcg gtcgatatcg gtggtagtat cactgcagat 120 gattatcgtc agaaatggga atggaaagtt ggtacaggtc ttaatggatt tggtaatgta 180 ttgaatgacc tgaccaatgg tggaaccaaa ctgaccatta ctgttactgg taataagcca 240 attttgttag gccgaaccaa agaagcattt gctacgccag taagtggtgg tgtagatgga 300 attcctcaga ttgcatttac tgactatgaa ggagcttctg taaaactcag aaacactgat 360 
ggtgaaacta ataaaggttt agcatatttt gttctgccga tgaaaaatgc agagggcact 420 aaagttggtt cagtgaaagt gaatgcatct tatgccggtg tgttcgggaa aggtggggtt 480 acttctgcgg acggggagct gttttcgctt tttgccgacg ggttgcgcgc tatcttttat 540 ggtggtttga cgacgactgt ttcgggtgct gcactcacga gtgggagtgc cgcagcggcg 600 cgcacagagt tgtttggaag tctatcaaga aatgatattc tcggacagat tcaaagagta 660 aacgcaaata ttacttctct tgttgacgtc gcaggttctt acagggaaga catggagtac 720 actgatggaa ctgttgtttc tgctgcctat gcactgggta ttgcaaacgg tcagactatt 780 gaggcaactt ttaatcaggc tgtaactacc agcactcagt ggagcgctcc gctgaacgta 840 gcaattactt attactaa 858 39 431 DNA Escherichia coli 39 gagggacttt catcttttag caatactaca aatgaaattg ttaaacggaa gttgaatatt 60 tctgttccaa cggatgaatt atttttagca gcgaagatga gtgatgggat taaaggtgtt 120 ttcgtaggga atacactcat tcctaagatt gaaatggcat cttatgatgg tagtgttatt 180 acacctagtt tcacttcaaa tacagcaatg gatattgctg taaaagtaaa aaactcaggt 240 gataatactg agctagggac tctttctgtt cctttgtcat ttggtgcggc agttgcaact 300 atttttgatg gcgatactac tgatagcgct gtagcgcata ttatcggtgg ttctgctggt 360 acagtatttg aagggcttgt taatccaggt cgatttactg atcagaatat agcctataaa 420 tggaatggac t 431 40 450 DNA Escherichia coli 40 tgcgactacc aatgcttctg cgaatacagg tactattaac ttcaatggca aaataacgag 60 tgctacttgt acaattgacc ctgaggtcaa tggtaatcgt acatcaacta tagatcttgg 120 gcaggctgct attagtggtc atggcactgt agtggatttt aaactaaaac cagcgcccgg 180 cagtaatgac tgcctagcga aaacaaatgc tcgtattgac tggtctggtt ctatgaacag 240 tttaggtttt aataatacag cttcaggaaa tactgctgct aaaggatacc atatgacttt 300 gcgcgcaaca aacgttggaa atgggtctgg tggtgctaat attaatactt cattcactac 360 ggctgaatac actcacactt ctgcaattca gtcatttaac tattcagccc agctgaaaaa 420 agatgaccgc gctccgtcta atggtggata 450 41 954 DNA Escherichia coli 41 aaatttagaa aagtgcatta tgcttatcac tagataagaa aataaaacac gaaatatagc 60 gagccatata gcctgttgtg tttgtaatag ataaaaaaca cgcaattgat tatttatgta 120 tctttttgtt tgtatttttt tattaaaaaa agcacacaat tactgcgtgc atcgaaatga 180 gttgaagtgg atgcatatat gcatgaaatg cttttaactt gaaagtctta atgtttctat 240 taattaagat 
aaggtaatat gagaatgaaa aaatccgcat taacattagc agtgctttcc 300 tctctgttca gtggttactc gctcgcagcg cccgctgaaa acaacaccag ccaggcaaat 360 ttagacttta ctggtaaagt tactgccagt ctatgccaag tggatacttc taatctgtcg 420 caaaccatag atcttggaga gttgtctact tctgctctta aagctactgg caaggggcct 480 gccaagtcat ttgcagttaa tcttatcaac tgcgatacaa cattgaattc tattaaatac 540 actattgctg gtaataataa tacaggaagt gatactaaat atttagttcc agcctccaat 600 gatactagtg catcaggagt tggcgtatac attcaggaca acaacgccca ggctgtggaa 660 attggtactg aaaaaactgt acctgtggta tcaaatggcg gattagctct ttcagaccaa 720 agtattccac tgcaagcata catcggaacc accacaggga atcctgatac aaacggtgga 780 gttacggccg gtactgtcac tgctagtgca gtaatgacta ttcgttcagc aggtacaccg 840 taattagata acaattttta tacaacaaaa caggaaggat tttgaactaa tccttcctgt 900 tattggagat tgaaatgtct aagtttgtaa tatttcttgt gtttttgttt atat 954 42 331 DNA Escherichia coli 42 gttgatcaaa ccgttcagtt aggacaggtt cgtaccgcca ctttgaagca ggctggagca 60 accagctctg ctgtcggttt taacattcag ctgaatgatt gcgataccac tgttgccaca 120 aaagccgctg ttgccttctt ggggacggcg attgacagta ctcatcctaa agtcctggct 180 ctacagagtt cagctgcggg tagcgcaaca aacgttggcg tgcagattct ggacagaaca 240 ggtaatgagc tgacgctgga cggtgcgaca tttagtgcag aaacaaccct gaataacggt 300 actaacacca ttccgttcca ggcgcgttat t 331 43 506 DNA Escherichia coli 43 tcgagaacgg ataagccgtg gccggtggcg ctttatttga cgcctgtgag cagtgcgggc 60 ggggtggcga ttaaagctgg ctcattaatt gccgtgctta ttttgcgaca gaccaacaac 120 tataacagcg atgatttcca gtttgtgtag aatatttacg ccaataatga tgtggtggtg 180 cctactggcg gctgcgatgt ttctgctcgt gatgtcaccg ttactctgcc ggactaccct 240 ggttcagtgc caattcctct taccgtttat tgtgcgaaaa gccaaaacct ggggtattac 300 ctctccggca caaccgcaga tgcgggcaac tcgattttca ccaataccgc gtcgttttca 360 cctgcacagg gcgtcggcgt acagttgacg cgcaacggta cgattattcc agcgaataac 420 acggtatcgt taggagcagt agggacttcg gcggtgagtc tgggattaac ggcaaattat 480 gcacgtaccg gagggcaggt gactgc 506 44 625 DNA Escherichia coli 44 gcgctgtcga gttctatcga gcgtctgtct tctggcttgc gtattaacag cgcgaaggat 60 gacgccgcag gtcaggcgat tgctaaccgt 
tttacttcta acattaaagg cctgactcag 120 gcggcccgta acgccaacga cggtatttct gttgcgcaga ccaccgaagg cgcgctgtcc 180 gaaatcaaca acaacttaca gcgtattcgt gaactgacgg ttcaggccac tacagggact 240 aactccgatt ctgacctgga ctccatccag gacgaaatca aatctcgtct tgatgaaatt 300 gaccgcgtat ccggccagac ccagttcaac ggcgtgaacg tgctggcgaa agacggttca 360 atgaaaattc aggttggtgc gaatgacggc gaaaccatca cgatcgacct gaaaaaaatc 420 gattctgata ctctgggtct gaatggcttt aacgtaaatg gtaaaggtac tattaccaac 480 aaagctgcaa cggtaagtga tttaacttct gctggcgcga agttaaacac cacgacaggt 540 ctttatgatc tgaaaaccga aaataccttg ttaactaccg atgctgcatt cgataaatta 600 gggaatggcg ataaagtcac agttg 625 45 359 DNA Escherichia coli 45 cagcacaggc agtggatacg acgattactg ttacagggag ggtattgcca cgtacctgta 60 ccattggtaa tggaggaaac ccaaacgcca ccgttgtttt ggataacgct tacacttctg 120 acctgatagc agccaacagc acctctcagt ggaaaaattt ttcgttgaca ttgacgaatt 180 gtcagaatgt aaacaatgtt actagctttg gtggaaccgc agaaaataca aattattaca 240 gaaatacagg ggatgctact aatatcatgg ttgagctaca ggaacaaggt aatggtaata 300 cccccttgaa agttggttca acaaaagttg ttacagtgag caatgggcag gcgacattc 359 46 207 DNA Escherichia coli 46 ggcggcgtgc gcttctcgca tgataaatcc agtacacaat atcacggcag catgctcggc 60 aacccgtttg gcgaccaggg taagagcaat gacgatcagg tgctcgggca gctatccgca 120 ggctatatgc tgaccgatga ctggagagtg tatacccgtg tagcccaggg atataaacct 180 tccgggtaca acatcgtgcc tactgcg 207 47 500 DNA Escherichia coli 47 tgttgaaaga tcagtcctca ttacccagca acattgggat acgctgatag gtgagttagc 60 tggtgtcacc agaaatggag acaaaacact cagtggtaaa agttatattg actattatga 120 agaaggaaaa cgtctggaga aaaaaccgga tgaattccag aagcaagtct ttgacccatt 180 gaaaggaaat attgaccttt ctgacagcaa atcttctacg ttattgaaat ttgttacgcc 240 attgttaact cccggtgagg aaattcgtga aaggaggcag tccggaaaat atgaatatat 300 taccgagtta ttagtcaagg gtgttgataa atggacggtg aagggggttc aggacaaggg 360 gtctgtatat gattactcta acctgattca gcatgcatca gtcggtaata accagtatcg 420 ggaaattcgt attgagtcac acctgggaga cggggatgat aaggtctttt tatctgccgg 480 ctcagccaat atctacgcag 500 48 556 DNA Escherichia coli 48 
aggttcttgg gcatgtatcc tggctctggg ccagttcccc attacacaga aactggccag 60 tctctttgtt tgcaataaat gtattacctg caatacgggc taaccaatat gctttattaa 120 cccgggataa ttaccctgtt gcatattgta gttgggctaa tttaagttta gaaaatgaaa 180 ttaaatatct taatgatgtt acttcattag tcgcagaaga ctggacttct ggtgatcgta 240 aatggttcat tgtctggatt gctcctttcg gggataacgg tgccctgtac aaatatatgc 300 gaaaaaaatt ccctgatgaa ctattcagag ccatcagggt ggatcccaaa actcatgttg 360 gtaaagtatc agaatttcac ggaggtaaaa ttgataaaca gttagcgaat aaaattttta 420 aacaatatca ccacgagtta ataactgaag taaaaaacaa gtcagatttc aatttttcat 480 taacaggtta agaggtaatt aaatgccaac aataaccgct gcacaaatta aaagcacact 540 gcagtctgca aagcaa 556 49 170 DNA Escherichia coli 49 aggcaggtgt gcgccgcgta ctacacatta ccgccgttga tgttatcaag cagggcaata 60 atttactcgg cgtaataaca gagagtaaat ctggtcgtca ggctattttg gcaaatgtca 120 ttattgactg tactggtgat gctgatattg catggtttgc cggagcacca 170 50 827 DNA Escherichia coli 50 ctggcggagg ctctgagatc agtagagggt gtggatgttg aaagtggtac gggtaaaacc 60 ggagggctgg aaatcagcat ccgaggaatg ccagccagtt acacgctgat actgattgat 120 ggtgttcgtc agggcggaag cagtgacgtg actcccaacg gtttttctgc catgaatacc 180 gggttcatgc cccctctggc cgccattgag cgtattgagg ttatcagggg gccgatgtcc 240 acactgtatg gctctgatgc gatgggcggt gtggtgaata tcattaccag aaagaatgca 300 gacaaatggc tctcttccgt caatgcaggg ctgaatctgc aggaaagcaa caaatggggt 360 aacagcagcc agtttaattt ctggagcagt ggtccccttg tggatgattc tgtcagcctg 420 caggtacgcg gtagcacaca acagcgtcag ggttcatcgg tcacatcact gagcgataca 480 gcaggcacgc gtattcctta tcccacggag tcacagaatt ataatcttgg tgcacgtctt 540 gactggaagg cgtcggagca ggatgtgctc tggtttgata tggataccac ccggcagcgt 600 tatgataacc gggatgggca actggggagt ctgacggggg gatatgaccg gaccctgcgc 660 tatgagcgaa acaaaatttc agctggctat gatcatactt tcaccttcgg aacatggaaa 720 tcgtatctga actggaacga gacagaaaat aaaggtcgtg agcttgtacg cagtgtactg 780 aagcgcgaca aatgggggct tgccggtcag ccgcgggagc ttaagga 827 51 258 DNA Escherichia coli 51 tctgatatag tttatatggg taataaggct ctttatttaa tccttatctt ttccttatgg 60 ccagtaggta tagctacggt 
tattggatta actattggtt tattacagac agtgactcaa 120 cttcaagagc agacacttcc ttttggtata aagcttatag gtgtctcaat atctttgcta 180 cttctttctg gatggtatgg tgaggtttta ttgtcttttt gtcatgaaat aatgttttta 240 attaagagtg gggtttga 258 52 500 DNA Escherichia coli 52 ttgcaaaagc aattttgcaa caaactactg cttgatacaa ataaggagaa tgttatggaa 60 attcaaaaca caaaatcaac ccagatttta tatacagata tatccacaaa acaaactcaa 120 agttcttccg aaacacaaaa atcacaaaat tatcagcaga ttgcagcgca tattccactt 180 aatgtcggta aaaatcccgt attaacaacc acattaaatg atgatcaact tttaaagtta 240 tcagagcagg ttcagcatga ttcagaaatc attgctcgcc ttactgacaa aaagatgaaa 300 gatctttcag agatgagtca cacccttact ccagagaaca ctctggatat ttccagtctt 360 tcttctaatg ctgtttcttt aattattagt gtagccgttc tactttctgc tctccgcact 420 gcagaaacta aattgggctc tcaattgtca ttgattgcgt tcgatgctac aaaatcagct 480 gcagagaaca ttgttcggca 500 53 668 DNA Escherichia coli 53 aagtcaaagc aggggttgcc cgaaccttta aagccccaaa cctgtatcaa tccagtgaag 60 gctatctgct ctactcgaaa ggcaatggct gtccaaaaga tattacatca ggcgggtgct 120 acctgatcgg taataaagat ctcgatccgg aaatcagcgt caataaagaa attggactgg 180 agttcacctg ggaagattac cacgcaagtg tgacctactt ccgcaatgat taccagaata 240 agatcgtggc cggggataac gttatcgggc aaaccgcttc aggcgcatat atcctcaagt 300 ggcagaatgg cgggaaagct ctggtggacg gtatcgaagc cagtatgtct ttcccactgg 360 tgaaagagcg tctgaactgg aataccaatg ccacatggat gatcacttcg gagcaaaaag 420 acaccggtaa tcctctgtcg gtcatcccga aatatactat caataactcg cttaactgga 480 ccatcaccca ggcgttttct gccagcttca actggacgtt atatggcaga caaaaaccgc 540 gtactcatgc ggaaacccgc agtgaagata ctggcggtct gtcaggtaaa gagctgggcg 600 cttattcact ggtggggacg aacttcaatt acgatattaa taaaaatctg cgtcttaatg 660 tcggcgtc 668 54 1689 DNA Escherichia coli 54 gcgatgttta accccgattc ggcgcagctg gacaatatgg cctgggcgca gccggcgatt 60 gtcgcgtttg aaatcgcgat ggcggcgcac tggcacgctg aaggactgaa gccagacttc 120 gccattgggc attccgtcgg tgaatttgcc gctgccgtcg tctgcggaca ctatacgatt 180 gaacaggtca tgccactggt ttgtcgacgc ggcgcactga tgcagcagtg cgcaagcggc 240 gcaatggtgg cggtatttgc agacgaagac acgctgatgc 
cgctggctcg ccagtttgag 300 ctggatctcg ccgccaacaa cggtacgcaa catacggtat tttccgggcc ggaagcccgt 360 ctcgcggtat tttgcaccac gctctcgcag cataacatta actatcgtcg cctgagcgta 420 accggcgcgg cgcactccgc tttactggaa ccgatactcg atcggttcca ggacgcctgc 480 gcggggctgc acgcggagcc ggggcaaata ccgattattt ccacgctcac cgccgacgtc 540 attgatgagt caacgctcaa ccaggcggat tactggcgcc gacacatgcg ccagccggtg 600 cgttttatcc agagtattca gatggcgcat cagctcggcg cccgcgtttt tctggagatg 660 gggcccgatg cccagttggt tgcttccggg cagcgcgaat accgcgataa cgcatactgg 720 atagccagcg cccggcgtaa caaagaggcg agcgatgtcc tcaatcaggc cctgctccag 780 ctttacgctg ccggtgtcgc cttaccgtgg accgacctac tggcgggtga tggacaacgt 840 atcgctgcgc catgttatcc gtttgatact gagcgttact ggaaagagcg cgtctccccg 900 gcctgcgaac ctgccgacgc agcgctgtct gccgggctgg aggtggcgag tcgcgccgcg 960 acagcgctcg atctcccccg tctggaagcg cttaaacagt gcgccacgcg actgcacgcc 1020 atctacgtcg atcaactggt acaacgctgt accggcgatg ccattgaaaa cggcgtggac 1080 gccataacca tcatacgccg tggacgtctg ctgccccgct accagcagct actccagcgc 1140 ctgctgaata actgcgtggt cgacggcgat taccgctgca ccgacgggcg atacgtccgc 1200 gcccacccca ttgaacatca acagcgggaa tcactgctga cggaacttgc cggttattgt 1260 gaaggttttc aggctattcc cgacaccatc gcccgtgccg gcgatcggtt atatgacatg 1320 atgagcggcg cggaagaacc ggtggcgatt atcttcccgc aaagcgcctc cgacggcgtg 1380 gaagtgctgt atcaggaatt cagctttggc cgctatttca accaaatcgc cgccggggta 1440 ttacgcggca ttgtccagac gcgtcagccc cgccagtcgt tgcgtattct tgaagttggc 1500 ggcggaaccg gcggcaccac cgcgtggctg ctgccggaac tcaacggcgt tccggcgctg 1560 gagtaccact tcaccgatat ctcagcgctg ttcacccgcc gcgcccagca gaaattcgcc 1620 gactatgatt ttgtgaagta tagcgagctg gatctcgaaa aagaggcgca gtctcagggt 1680 ttccaggca 1689 55 1241 DNA Escherichia coli 55 gccggaaagc ctggccttta accatccggc cagcgccccg tatattcagg aactggcgac 60 aatttgccaa cagcttgcac agcgcttaca gcgcccggta cgcctgcttg aggtgggaac 120 ccgcaccggc cgcgccgcag aatcgctgtt ggcacagctc aacgccggac agattgagta 180 tgtcgggctt gagcagagcc aggagatgct actgagcgcc cggcagaggc tcgcctcctg 240 gcctggtgcc cgtctgtccc 
cctggaatgc agacacgctg gcggcgcacg ctcactcggg 300 ggacattatc tggcttaata acgccctgca tcgtctgctg ccggaagatc ccgggctcct 360 tgcgacatta caacagcttg ccgttcccgg cgcgctgctc tacgtgatgg agtttcgcca 420 gttaacgccg tccgccctgc tcagcacgct cctgttaacc aatgggcagc cggaggcctt 480 gctgcataac agcgccgact gggcggcatt atttagcgcg gccgccttca actgtcagca 540 tagcgatgag gtcgcggggt tacaacgctt cctcgtacaa tgtcctgaca ggcaggtgcg 600 ccgcgatccc cgtcaacttc aggccgccct cgccgggcgt ctgccggggt ggatggtgcc 660 gcaacggatc gtcttcctcg acgccttacc gctgacggct aacgggaaaa ttgactacca 720 ggcgctgaag cgtcgtcata cccctaaagc ggaaaaccag gccgaagcgg atttacccca 780 gggcgacatt gaaaaacagg ttgccgccct ctggcagcaa ctcttatcga ctggcaatgt 840 caccagagaa accgacttct tccagcaagg cggcgatagc ctgctggcga cccgtctgac 900 cgggcaactt catcaggcag gttatgaagc gcaattaagc gacctgttta atcatccccg 960 gctggcggat tttgccgcca cgctgcgtaa aatcgacgtc ccggtcgaac aaccattcgt 1020 ccactctcct gaagaacgct accagccctt tgcgcttacc gacgtgcagc aggcttacct 1080 ggtggggcgt cagccgggct ttaccctggg cggcgtcggc tcacatttct ttgttgaatt 1140 tgaaattgcc gatctggacc tcacccggct ggagacggtc tggaaccgat taatcgcccg 1200 ccacgatatg ctacgcgccg tcgtgcttga tggacagcaa c 1241 56 607 DNA Escherichia coli 56 tcacatagga ttctgccgtt tttaacaatg caggataata agatgaaaaa aatgttattt 60 tctgccgctc tggcaatgct tattacagga tgtgctcaac aaacgtttac tgttggaaac 120 aaaccgacag cagtaacacc aaaggaaacc atcactcatc atttcttcgt ttccccaatt 180 ggacagagaa aactgttgat gcagccaaaa tttgttggcg gtgcagaaaa tgttgttaaa 240 acagaaactc agcaaacatt cgtaaatgca ttgcccggtt ttatcacttt tggcatctat 300 actccgcggg aaacccgtgt atattgctca caataggccc atcgatatgg ggagctcatc 360 tgcactgttc attataactt ctgggctccc tacagttgtt tttgcatagt gataagcctc 420 tctctgaggg aggaaataat cctgttcagc gatgtctacc agtcgggggg gctgcattat 480 ccaccccgag gcggtggtgg cttcacgcgg ggatgggcag attgatctga tatgcaaccg 540 acgacgacca gcggcaacat catcacgcag agcttcattt tcagatttgg gccacctttt 600 gatttct 607 57 778 DNA Escherichia coli 57 aagtgtcgat tttattggtg tagggacagg gccatttaat ctcagcattg ctgcgttgtc 60 
acatcagatc gaagaactgg actgtctctt ctttgatgaa catcctcatt tttcctggca 120 tccgggtatg ctggtaccgg attgtcatat gcagaccgtc tttctgaaag atctggtcag 180 tgctgttgca cctacaaatc cctacagttt tgttaactat ctggtgaagc acaaaaagtt 240 ctatcgcttc cttacaagca gactacgtac agtatcccgt gaagagtttt ctgactacct 300 ccgctgggct gctgaagata tgaataacct gtatttcagt cataccgttg aaaacattga 360 tttcgataaa aaacgtcgat tgtttctggt gcaaaccagc cagggacaat attttgcccg 420 caatatctgc cttggtacag gaaaacaacc ttatttacca ccctgtgtga agcatatgac 480 acaatcctgt ttccatgcca gtgaaagtaa tcttcgtcgg ccggatctta gtggaaaacg 540 gataaccgtg gttggtggag gacagagtgg tgcagacctg ttccttaatg cattacgcgg 600 ggaatgggga gaagcggcgg aaataaactg ggtgtcccgg cgtaataatt ttaacgcact 660 ggatgaggct gcttttgctg atgattattt tacacctgaa tatatttcag gcttctccgg 720 actggaggaa gatattcgcc atcagttact ggatgagcag aaaactgaca tcggatgg 778 58 302 DNA Escherichia coli 58 ggctggacat catgggaact ggtacgctga acatcgatga atcccggcag cttcagttga 60 tcacacagta ctataaaagc cagggcgacg acgattacgg gcttaatctc gggaaaggct 120 tctctgccat cagagggacc agcacgccat tcgtcagtaa cgggctgaat tccgaccgta 180 ttcccggcac tgacgggcat ttgatcagcc tgcagtactc tgacagcgct tttctgggac 240 aggagctggt cggtcaggtt tactaccgcg atgagtcgtt gcgattctac ccgttcccga 300 cg 302 59 2126 DNA Escherichia coli 59 cttcctgttc tgattcttct ggcgctatcg gggagctttt ctaccgctgt agccgctgat 60 aaaaaagaga ctcaaaattt ctactatcca gaaacactgg atttaactcc tctgagatta 120 cacagccctg aatcaaatcc ctggggggct gattttgatt atgccaccag atttcaacag 180 ctggatatgg aggctctgaa aaaagatatc aaagatttgc tgacaacttc ccaggattgg 240 tggcctgcgg attatggtca ttatggtcct ttctttattc gtatggcttg gcacggtgcc 300 ggaacataca ggacatatga tggccgggga ggcgccagtg gtggtcagca acgttttgaa 360 ccgctgaaca gctggccgga taacgttaat ctggataaag cccgtcgatt gctgtggcca 420 gtcaagaaaa aatacggctc cagtatttcc tggggagacc tgatggtcct gactggtaat 480 gttgcccttg aatccatggg atttaaaacg ctgggatttg ctggcggaag agaagatgac 540 tgggagtcgg acctggtata ctgggggcct gacaacaagc ctcttgcaga taaccgggat 600 aaaaacggga aacttcagaa acctcttgcc gccacgcaga 
tgggacttat ttatgtcaat 660 cctgaaggcc ccggtggaaa accagatcct ctggcttccg cgaaagatat cagggaagct 720 ttttcacgta tggccatgga tgatgaggag actgtggccc tgatcgcggg agggcataca 780 tttggtaaag cacatggtgc agcgtctcct gaaaaatgta ttggcgcagg gcctgatggt 840 gcacctgtgg aggagcaggg actgggatgg aaaaataaat gtggtacagg aaacggcaaa 900 tataccatca ccagtggcct ggaaggagcc tggtcgacat cgccaaccca gttcacaatg 960 cagtatctga agaatttata taaatatgaa tgggagctgc acaagagtcc tgccggtgct 1020 tatcagtgga agcctaaaaa agcggcaaat atagttcagg acgcgcatga tccgtctgtc 1080 ctgcatccgt tgatgatgtt tacgacggat attgctctta aagttgatcc tgaatataag 1140 aaaataacca cccgtttcct gaatgatcca aaagcttttg agcaggcatt cgcaagagca 1200 tggtttaaac tgacccaccg ggatatggga ccggcagccc gatatcttgg taatgaagtt 1260 cctgcagaat catttatctg gcaggatcct cttcctgcgg cggattatac aatgattgat 1320 ggtaaagaca ttaagtcgct gaaagagcag gttatggatt tgggtatccc tgcatctgag 1380 ctgataaaga cagcctgggc ttcagcttcc acatttcgtg tgactgatta tcgtggggga 1440 aataatggtg cccgcatcag gttacagccc gaaattaact gggaagttaa tgagcctgaa 1500 aaactgaaga aagtactggc atccctgacc tcattacagc gtgaatttaa caaaaaacag 1560 tctgacggaa agaaagtgtc gttggctgat ttaattgttc tttcgggtaa tgctgcaatc 1620 gaagatgcgg ccagaaaagc cggggtggaa cttgagattc cctttactcc gggaagaact 1680 gacgcctctc aggagcagac ggatgttgcc tcattcagtg tactggagcc gacagcagat 1740 ggattcagaa attattactc aaaaagcaga agtcatatat cgccggttga aagcctcatt 1800 gataaagcca gtcagctgga tctcaccgtt cctgaaatga cggcattact gggtggtctg 1860 cgggtaatgg atattaatac aaataattct tcgttgggag tgtttaccga tacccctggt 1920 gttctggata acaagttttt tgttaatctg ctggatatgt caacacgatg gagtaaagca 1980 gataaagaag atacatacaa tggattcgat cgtaaaacgg gagcattaaa atggaaagca 2040 tcctctgttg atttaatctt cagttcaaat cctgaattac gtgcggtggc agaagtatat 2100 gcctcggatg atgcgagaaa taagtt 2126 60 501 DNA Escherichia coli 60 aattgtttta aaatctgttc tttttctgat attgcctgag tgagttttga ttctttttcg 60 cttatctctt cttcatattt ctgtatgtca tttatttttt tttcaatatc atctggatat 120 ttttcttgtc tcacaacatc gttgtgattt atttttgaaa gtttaattac ctcttccttt 180 
tcttttttta gctcattaat cctgttcaat aataaaaatt gtcttttctt aaatatagat 240 ttagtcaatt caatatctgc atccttcttt tttgactctt gcaccaaatg agatacaata 300 ttgaagatag aatttctctc ttcgactacc tcccataatg cagcctcagc cgaaacgtta 360 tctttcgatg tttctatata aggtaggttt gcataggctt gtaactcatt ataaagcttc 420 agaacctcgg ttctttcttt tattaattca gatactaagt actcactaac ctttgtatca 480 ttaaatgtaa tttcagtctc a 501 61 270 DNA Escherichia coli 61 gcgcatttgc tgatactgtt gggcattttt ggttacatta tgcaccgcac gatgccagac 60 atctcattcc cggtgttttt acttaatggc ctgattccct tttttatctt tagcagtatc 120 agcaatcgtt ctgtaggcgc tattgaagcg aaccaggggt tgtttaatta tcgaccagta 180 aaacccatcg atacgatcat tgcacgcgca ctgcttgaga cgctgattta cgttgctgtt 240 tatatattgc tcatgcttat cgtctggatg 270 62 390 DNA Escherichia coli 62 tcctcttgct actattcccc ctcaatatca gcattggttt ttatggaatc cacttgtgca 60 tgctgtagaa ctaatccgaa gggcatggat atctggttat cgtagtcctg atgtaagttg 120 ggcgtatctg tcggttgtca ccttattatt gctcactttt gctatgagtt gttaccgatt 180 acggcatcgc caattgattg ctagttagcg ttaagaaaaa tgattattct tgataatgta 240 tcaaaatatt atccgactaa atttggacga aattatgtcc tgaggaatgt aaatattgag 300 ctaccaaggg accgtaatat aggtattcta ggtatcaatg gagcaggaaa atctactttg 360 ttacgtttgt taggagggat ggatacgcct 390 63 659 DNA Escherichia coli 63 cagttcgctc gtaaagcaga aaaatgcgac agaagatgtt gttttaatag gcaaaatgat 60 tttagatgaa gttagaagtt acagaactat acataatgat cgaaatatcg taagtaactc 120 aggaaactgg aaaacatctt ttttatgtaa tcttgctaga ctactatata gcatatttaa 180 tggtagtaac tatttttgtt cccgagaggg tgaaaataat tcatccccca gttctactct 240 acttactata catcagcctg aaaagcagga actattacaa caaaagagta tcaaacattt 300 accaacaagt aataacatcg acggatacat taaaataaga aaaacaagag gcgctgaaga 360 tcaaacaaca actatcactc aaagtttgat aattaatgag ttgttaaatg gagttgatag 420 aaataccatc ccttttcaga aaataagtga gctcaatgat atcatacatt catatgaaaa 480 tatgcaaatt aaaaatagtc gaaaaggtat agaaatactt gttaagcagg gagagctgtt 540 atcatcatta ataaatgata ataaaggaaa taaacaatta tcagacaatg catctaaaat 600 aataaactta ttgggtatag agtatcagtc acataaagta gacatagagc 
catttatac 659 64 501 DNA Escherichia coli 64 gaacaattca aacagttcag tattgaaaaa caggctgcga ttaactcgct attacagttg 60 cgcggaatgt tagaaatgct gggagagatg gggataaaca tcagcgacga tttacaaaaa 120 gtcacttctg caattaatgc catcgaatct gatgtcctgc gtattgctct gttgggggcg 180 ttctccgatg gcaaaaccag cgttatcgcc gcatggctgg gtaaagtaat ggatgatatg 240 aatatttcca tggatgagtc ctccgatcgg ttgagtattt acaaaccgga aggtctgcca 300 gatcagtgtg aaattgttga tacgcccggg ctgtttggtg ataaagagcg cgaggtggac 360 gggagactgg tgatgtatga agacctgacc agacgctata tatctgaagc acacctgatt 420 ttttacgtgg ttgatgccac gaacccgctc aaggagagcc acagcgacat cgtaaaatgg 480 gtattgcgcg atttgaataa a 501 65 424 DNA Escherichia coli 65 caaatacagt ccgcgtacga atgaaagatg cttatcaacg tgatggtaaa tatccagatt 60 ttgtggaccc attaagcctt actgcaaata caattaaaac tgatacaagc ggaatacctg 120 cagcacagtt agttcagctt gggaaaatta caccagacga agtgcgtaat aacatttctg 180 gcgactttat cgctattggc ggtgctttaa cttcgaatgg tgctcaagtt aaaaaaggtt 240 ttgctatcga acttaatgga ttaagccaag agcagtgccg ttctattctt gggcaagttg 300 ggaataactg ggaatatgtt gctattggta cttctgcgtc tggttcatat gccatgacag 360 caactggtgt agatatgtct gtggccgcct ctacaactgt tttacgctct ttaggtaaca 420 atgg 424 66 275 DNA Escherichia coli 66 ttacggcgtt actatcctct ctatgtgcat acggagctcc ccagtctatt acagaactat 60 gttcggaata tcgcaacaca caaatatata cgataaatga caagatacta tcatatacgg 120 aatcgatggc aggcaaaaga gaaatggtta tcattacatt taagagcggc gcaacatttc 180 aggtcgaagt cccgggcagt caacatatag actcccaaaa aaaagccatt gaaaggatga 240 aggacacatt aagaatcaca tatctgaccg agacc 275 67 500 DNA Escherichia coli 67 ttggcagtta caggaatgca ttgtgataat gcgtatggaa atacaataca tattatagaa 60 caagataatt ttaatattat caaggttgtg gatataaata tcaatacaac ttcacatact 120 cacattctcc attcaatgag tgtttgcctc aattcgtttg gtgatttttt ttcaaataac 180 acatatgatg cggttatggt tttaggcgat agatatgaaa tattttcagt cgctatcgca 240 gcatcaatgc ataatattcc attaattcat attcatggtg gtgaaaagac attagctaat 300 tatgatgagt ttattaggca ttcaattact aaaatgagta aactccatct tacttctaca 360 gaagagtata aaaaacgagt aattcaacta 
ggtgaaaagc ctggtagtgt gtttaatatt 420 ggttctcttg gtgcagaaaa tgctctttca ttgcatttac caaataagca ggagttggaa 480 ctaaaatatg gttcactgtt 500 68 537 DNA Escherichia coli 68 gcttactgat tctgggatgg attaacagaa cacaactggc ttgtccataa gcaaaatgaa 60 ggcaaaaaaa tatgaaaatc aaatatacaa tgaaaatggc cgccgttgcc agcgtcatgg 120 tcgccggcta gctatcgcgg atgcaaatgg gctcaacact gtgaacgccg gggatggcaa 180 gaatctgggc accgcaaccg cgacgatcac cactctgcag agctgctctg tcgacctgaa 240 tctcgttacc ccgaacgcga cagtgaacag agcaggaatg ctagcaaacc gcgaaatcac 300 taaattttcg gtggggagta aggattgccc tagcgacacc tatgctgtat ggtttaaaga 360 gatcgatggc gaaggacagg gggtcgcgca gggcactacg gtgaccaaca agttttacct 420 taaaatgaca tcggccgacg ggaccgcgag cgtaggggac atcaacatag gaaccaaatc 480 aggcaaaggc ctgagtggtc aactggtagg gggaaaattc gacggaaaaa taacggt 537 69 1422 DNA Escherichia coli 69 cttgcggagg cttgtctgag cggtttccgc gattctcttc tgtaaattgt cgctgacaaa 60 aaagattaaa cataccttat acaagacttt tttttcatat gcctgacgga gttcacactt 120 gtaagttttc aactacgttg tagactttac atcgccaagg gtgctcggca taagccgaag 180 atatcggtag agttaatatt gagcagatcc cccggtgaag gatttaaccg tgttatctcg 240 ttggagatat tcatggcgta ttttggatga taacgaggcg caaaaaatga aaaagacagc 300 tatcgcgatt gcagtggcac tggctggttt cgctaccgta gcgcaggccg ctccgaaaga 360 taacacctgg tacactggtg ctaaactggg ctggtcccag taccatgata ctggtttcat 420 caacaacaat ggcccgaccc atgaaaacca actgggcgct ggtgcttttg gtggttacca 480 ggttaacccg tatgttggct ttgaaatggg ttacgactgg ttaggtcgta tgccgtacaa 540 aggcagcgtt gaaaacggtg catacaaagc tcagggcgtt caactgaccg ctaaactggg 600 ttacccaatc actgacgacc tggacatcta cactcgtctg ggtggcatgg tatggcgtgc 660 agacactaaa tccaacgttt atggtaaaaa ccacgacacc ggcgtttctc cggtcttcgc 720 tggcggtgtt gagtacgcga tcactcctga aatcgctacc cgtctggaat accagtggac 780 gaacaacatc ggtgacgcac acaccatcgg cactcgtccg gacaacggca tgctgagcct 840 gggtgtttcc taccgtttcg gtcagggcga ggcagctcca gtagttgctc cggctccagc 900 tccggcaccg gaagtacaga ccaagcactt cactctgaag tctgacgttc tgttcaactt 960 caacaaagca accctgaaac cggaaggtca ggctgctctg gatcagctgt 
acagccagct 1020 gagcaacttg gatccgaaag acggttccgt agttgttctg ggttacaccg accgcatcgg 1080 ttctgacgct tacaaccagg gtctgtccga gcgccgtgct cagtctgttg ttgattacct 1140 gatctccaaa ggtatcccgg cagacaagat ctccgcacgt ggtatgggcg aatccaaccc 1200 ggttactggc aacacctgtg acaacgtgaa acagcgtgct gcactgatcg actgcctggc 1260 tccggatcgt cgcgtagaga tcgaagttaa aggtatcaaa gacgttgtaa ctcagccgca 1320 ggcttaagtt ctcgtctggt agaaaaacgc tgctgcgggt ttttttttgc ctttagtaaa 1380 ttgaactgac tttcgtcagt tattccttac ccagcaatgc ct 1422 70 559 DNA Escherichia coli 70 atctagccga agaaggaggc cgaaaagtca gtcaactcga ctggaaattc aataacgctg 60 caattattaa aggtgcaatt aattgggatt tgatgcccca gatatctatc ggggctgctg 120 gctggacaac tctcggcagc cgaggtggca atatggtcga tcaggactgg atggattcca 180 gtaaccccgg aacctggacg gatgaaagta gacaccctga tacacaactc aattatgcca 240 acgaatttga tctgaatatc aaaggctggc tcctcaacga acccaattac cgcctgggac 300 tcatggccgg atatcaggaa agccgttata gctttacagc cagaggtggt tcctatatct 360 acagttctga ggagggattc agagatgata tcggctcctt cccgaatgga gaaagagcaa 420 tcggctacaa acaacgtttt aaaatgccct acattggctt gactggaagt tatcgttatg 480 aagattttga actcggtggc acatttaaat acagcggctg ggtggaatca tctgataacg 540 atgaacacta tgacccggg 559 71 360 DNA Escherichia coli 71 atgaggaaca taatggcagg ttttttaata ttcctgtctt ctgctgctta tgctgatatc 60 aatctgtatg gtcctggtgg cccgcataca gccttgcttg atgcagccaa actttatgcc 120 gaaaaaacag gtattatagt gaacgttcat tacggcccac agaacaaatg gaatgaagat 180 gccaaaaaaa atgcagatat cttgtttggc gcatcagaac aatctgctct ggctatcatt 240 cgggaccata aagacagctt cagtgaaaaa gatattcagc ctctttatct gcgaaaaagt 300 attttactgg taaagaaagg taatcctaaa aatatccgga gtattgacga cctgaccaga 360 72 721 DNA Escherichia coli 72 atggcagtgg tgtcttttgg tgtaaataat gctgctccaa ctattccaca ggggcagggt 60 aaagtaactt ttaacggaac tgttgttgat gctccatgca gcatttctca gaaatcagct 120 gatcagtcta ttgattttgg acagctttca aaaagcttcc ttgaggcagg aggtgtatcc 180 aaaccaatgg acttagatat tgaattggtt aattgtgata ttactgcctt taaaggtggt 240 aatggcgcca aaaaagggac tgttaagctg gcttttactg gcccgatagt 
taatggacat 300 tctgatgagc tagatacaaa tggtggtacg ggcacagcta tcgtagttca gggggcaggt 360 aaaaacgttg tcttcgatgg ctccgaaggt gatgctaata ccctgaaaga tggtgaaaac 420 gtgctgcatt atactgctgt tgttaagaag tcgtcagccg ttggtgccgc tgttactgaa 480 ggtgccttct cagcagttgc gaatttcaac ctgacttatc agtaatactg ataatccggt 540 cggtaaacag cggaaatatt ccgctgttta tttctcaggg tatttatcat gagactgcga 600 ttctctgttc cacttttctt ttttggctgt gtgtttgttc atggtgtttt tgccggtccg 660 tttcctccgc ccggcatgtc ccttcctgaa tactggggag aagagcacgt atggtgggac 720 g 721 73 318 DNA Escherichia coli 73 gacggctgta ctgcagggtg tggcggttgg attgtcagcc tcaaggtcta aatatctggg 60 gcgtgataac gattctgctt acctgcgtat atccgtgccg ctggggacgg ggacagcgag 120 ctacagtggc agtatgagta atgaccgtta tgtgaatatg gccggctaca ctgacacgtt 180 caatgacggt ctggacagct acagcctgaa cgccggcctt aacagtggcg gtggactgac 240 atcgcaacgt cagattaatg cctattacag tcatcgtagt ccgctggcaa atttgtccgc 300 gaatattgca tccctgca 318 74 336 DNA Escherichia coli 74 gcaacagcaa cgctggttgc atcatattcg taatagtatc aactaaaata cgttaatttt 60 atatctcgta aaataaaatg ttttctgtac cgctctccgg agggggaatg attcgtttat 120 cattatttat atcgttgctt ctgacatcgg tcgctgtact ggctgatgtg cagattaaca 180 tcagggggaa tgtttatatc cccccatgca ccattaataa cgggcagaat attgttgttg 240 attttgggaa tattaatcct gagcacgtgg acaactcacg tggtgaagtc acaaaaacca 300 taagcatatc ctgtccgtat aagagtggct ctctct 336 75 461 DNA Escherichia coli 75 tcgtgctcag gtccggaatt tgcgagtgga gtgtattttc aggagtatct ggcctggatg 60 gttgttccta aacatgtcta tactaatgag gggtttaata tatttcttga tgttcagagc 120 aaatatggtt ggtctatgga gaatgaaaat gacaaagatt tttacttctt tgttaatggt 180 tatgaatggg atacatggac aaataatggt gcccgtatat gtttctatcc tggaaatatg 240 aagcagttga acaataaatt taatgattta gtattcaggg ttcttttgcc agtagatctc 300 cccaagggac attataattt tcctgtgaga tatatacgtg gaatacagca ccattactat 360 gatctctggc aggatcatta taaaatgcct tacgatcaga ttaagcagct acctgccact 420 aatacattga tgttatcatt cgataatgtt gggggatgcc a 461 76 190 DNA Escherichia coli 76 gggatgagcg ggcctttgat gcaggtaatt tgtgtcagaa accaggagaa 
acaactcgtc 60 tgactgagaa atttgacgat attattttta aagtcgcctt acctgcagat cttcctttag 120 gggattattc tgttacaatt ccatacactt ccggcataca gcgtcatttc gcgagttact 180 tgggggcccg 190 77 268 DNA Escherichia coli 77 taagctatgt ggcctgcaat ggatttacct ggactcatgg tctttactgg tctgagtatt 60 ttgcatggct ggttgttcct aaacatgttt cctataatgg atataatata tatcttgaac 120 ttcagtccag aggaagtttt tcacttgatg cagaagataa tgataattac tatcttacca 180 agggatttgc atgggatgaa gcaaacacat ctggacagac atgtttcaat atcggagaaa 240 aaagaagtct ggcatggtca tttggtgg 268 78 922 DNA Escherichia coli 78 tcgccaccaa tcacagccga accgccgatt ggcgtaaagc ggaaaactga cgtcaccaga 60 tggtttaagc caaaaggaat cgtcacgcgt tcggcaactg catagaagaa ataaccaaca 120 ggaccggaag ttgaaatcca gtgtccaatg agcatgaaaa gattgaaaaa cggcggccag 180 ataaaaggaa tgatcagacc aaatccactc atcacaatca gtgtaatgat aggcaccaga 240 cgtgggccgc tataagaacc taacgattca ggaatgcgta aattaacgat ctttttatac 300 atgctggcga ctaataaccc agcaacaatt ccccccaaca cgctggtatt gtaggactgg 360 atccccagaa tgatggtttg cccatgtgtc gacatttggt cagcaacgac caataagtcg 420 tgctgtttaa gataaaagtt cgttcccaga tgcatcgcca taaaaccaat taagccagaa 480 aaagcaccat aggctttatc ctctttatct tttaataatc ctaagggaat cgctatcgca 540 aacaatacag gtaaattaac aaaggcaaac aaaccaagac taacaatgaa atcaagtatg 600 gttttaatta ttggaatagc cagaaatgga attaactttg ccatatcatc actggctaaa 660 ccacttccca gccctagcat catgccacat acacttagca gagcaatggg atacataaat 720 gccttcccca ggctctgaaa aaaactccag gctttctttt gtttcatgtg ggttatctca 780 tataaatgtt atatataatt agtccattaa tactttggta cgaatagaga gatatagttt 840 ttcttctaaa attaattcat atttaaaagt ggcatacaga taccgttcaa tttcatgaat 900 tgcgcgctgt aacaggatgt cc 922 79 501 DNA Escherichia coli 79 ggtgatcgat tattccgctg agcgtattca gtctttaaaa gacaaataca gcctgccgga 60 tgagtttatc ttgtcgctgg cgatgatcga gccgcggaaa aatattgaag cgcttattca 120 cgcctacagc ttgctgcctg ccgagctgca gcagcgctat ccgatggtgc tggcgtataa 180 agtgcagcca gaacaactgg agcggatcct gcgtctggcg gaaagctatg gtttgtcacg 240 cagccagctt atctttactg ggttcctgac cgacgacgat ctgattgccc tgtacaacct 
300 gtgcaaactg tttgtgttcc cgtcgctgca tgaaggtttc ggcctgccgc cgctggaagc 360 gatgcgctgc ggggcggcga ccttaggttc aaacattacc agcctgccgg aagtcattgg 420 ctgggaagat gccatgttca atccgcatga tgtgcaggac attcgccggg tcatggagaa 480 ggcgctgacc gatgaggcgt t 501 80 500 DNA Escherichia coli 80 tctgcacgtt taaaattatt gcctgggtta aagtcaactg agtatgtgta ttcagatctt 60 catgctttac ttgatactaa tggtgggagt tccttaggtc cgaacattgg tagtgatggt 120 tctaacctaa caataacatg tttatcgatg agcagagttt ttcttactga aaaacttgtt 180 aattctatat atcagcatat accttatttt aaaggtgata ttctgattgt tgataatggc 240 agcacagtag aagaactttc aattttacaa gatttaagtg ataggatccc gttaaatatt 300 agagttgtcg agcttggtaa taattttggc gtaagtggtg gaagaaacaa aactttagag 360 catataaaaa cagaatgggc aatgtttctc gataatgata tttatttcat aaataatcca 420 cttccgagat tgcaaaatga tatttcaaga cttggttgtc attttatcaa tatgccattg 480 cttgattctg acggagaaac 500 81 406 DNA Escherichia coli 81 tagagaaatt atcaagttag ttccattagt atcaattgat ctgctaattg aaaacgagaa 60 tggtgaatat ttatttggtc ttaggaataa tcgaccggcc aaaaattatt tttttgttcc 120 aggtggtagg attcgcaaaa atgaatctat taaaaatgct tttaaaagaa tatcatctat 180 ggaattaggt aaagagtatg gtatttcagg aagtgttttt aatggtgtat gggaacattt 240 ctatgatgat ggtttttttt ctgaaggcga ggcaacacat tatatagtgc tttgttacac 300 actgaaagtt cttaaaagtg aattgaatct cccagatgat caacatcgtg aatacctttg 360 gctaactaaa caccaaataa atgctaaaca agatgttcat aactat 406 82 292 DNA Escherichia coli 82 gtgtccattt atacggacat ccatgtgata tggaacaaat tgtagaactg gccaaaagta 60 gaaatttgtt tgtaattgaa gattgcgctg aagcctttgg ttctaaatat aaaggtaaat 120 atgtgggaac atttggagat atttctactt ttagcttttt tggaaataaa actattacta 180 caggtgaagg tggaatggtt gtcacgaatg acaaaacact ttatgaccgt tgtttacatt 240 ttaaaggcca aggattagct gtacataggc aatattggca tgacgttata gg 292 83 259 DNA Escherichia coli 83 cggacatcca tgtgatatgg aacaaattgt agaactggcc aaaagtagaa atttgtttgt 60 aattgaagat tgcgctgaag cctttggttc taaatataaa ggtaaatatg tgggaacatt 120 tggagatatt tctactttta gcttttttgg aaataaaact attactacag gtgaaggtgg 180 aatggttgtc acgaatgaca 
aaacacttta tgaccgttgt ttacatttta aaggccaagg 240 attagctgta cataggcaa 259 84 786 DNA Escherichia coli 84 atccatcagg aggggactgg ataggttatt ttctccatta tgactgcatg gttaatgagc 60 agtgtaataa tggttttata atgtttgaac ctggatatga attaattgtt tccttatttg 120 gatatttggg atttcagaca attattattt ttatagccgc tgtaaatgta attctaatat 180 taaattttgc aaagcatttt gaaaacggaa gttttgttat tgttgcgata atgtgcatgt 240 tcctttggag tgtttatgtt gaggcgatta gacaggctct ggccttatct atagttatat 300 ttgggattca ttctcttttt ttgggtagaa aaaggaaatt tataacatta gtattatttg 360 cgtcaacttt ccatataact gctttgattt gttttcttct aatgactcct ctattttcaa 420 agaaattaag caagataata agttatagcc tattaatttt cagtagcttc tttttcgctt 480 tttctgaaac catattaagt gcactccttg caattttgcc agaaggatcc attgccagtg 540 aaaaattaag tttttactta gcaaccgagc aatacaggcc acagttatct attgggagtg 600 gcactattct tgacattata cttatttttc tgatatgtgt aagttttaaa cgaataaaga 660 aatatatgct cgctaattat aatgctgcaa atgagatatt gcttattggt tgctgtcttt 720 atatttcttt cggtattttt atcgggaaaa tgatgccagt tatgactcgc attggttggt 780 atggtt 786 85 521 DNA Escherichia coli 85 ctaccgtagc gggcgatggt agctggacaa ccaccgtacc cgccgccgat ctcagcgtgt 60 tacgcgacgg cgacgccacc gtgcaggcca gcgtcagcac tattaacggc aacacggctt 120 cggcaaccca cgcctacagc gtcgatgcca cggccccgac gcttgccatt aacaccatcg 180 ccaccgacga tattctgaac gctgccgagg cgggcaatcc gttaaccatc agcggtagca 240 gcaccgccga agcggggcag acggtaaccg tcacgcttaa tggtgtgact tacagcggct 300 ccgtccaggc ggacggcagc tggagcgtca gcttaccgac ggcggatctc agcaatctga 360 ccgccagcca gtacaccgtt agtgcctcgg taagcgataa agcgggtaac ccggcgtccg 420 ctaaccacgg gctggcggtg gatctcaccg tgccggtgct gaccatcaac accgtctccg 480 gcgatgacat tattaacgcc gccgaacacg gacaggcgct g 521 86 408 DNA Escherichia coli 86 ctccggagaa ctgggtgcat cttacccgcg gagatatgaa actgcatatg caggcgaggt 60 ataaggccac acattatccc gtcgccgggg gaaaggcaaa tggacaggta tggttttctc 120 tgacctatct gtaactggca gatataatgc catttaatta aggctgttaa taacatgatg 180 aagcacatgc gtatatgggc cgttctggca tcatttttag tcttttttta tattccgcag 240 agctatgccg gggttgctct 
gggtgccacc cgtgagattt accctgaagg gcaaaaacag 300 gtacaactgg cggtaacaaa taatgatgat aaaagtagtt accttattca gtcatggatt 360 gaaaatgctg aaggaaaaaa ggatgccagg tttgtaatta ctcctccg 408 87 500 DNA Escherichia coli 87 ccctgacctt gggtgttgcg acaaatgcgt ctgctgtcac cacggttaat ggtggtacag 60 ttcattttaa gggggaagtt gttgatgctg catgtgctgt aaacactaat tcagcaaatc 120 aaacgttttc tgggcaagtt cgttcagcta agttggcgaa tgatggagag aagagttccc 180 ctgttggatt tagtattgaa cttaatgact gtagttctgc aactgccggg catgcatcaa 240 ttatctttgc aggaaatgtt attgctacac acaatgatgt gctgtctcta cagaatagtg 300 ctgcaggtag tgcaacaaat gtaggtattc agatattgga tcatacaggt actgcagttc 360 aatttgacgg agtgactgca tctacacaat ttacattaac agatggcacc aataaaattc 420 ctttccaggc agtttattat gcaacaggta agtcaacgcc tggtattgcc aacgccgacg 480 ccacctttaa agttcagtac 500 88 214 DNA Escherichia coli 88 aagaaatcaa tattatttat ttttctttct gtattgtctt tttcaccttt cgctcaggat 60 gctaaaccag tagagtcttc aaaagaaaaa atcacactag aatcaaaaaa atgtaacatt 120 gcaaaaaaaa gtaataaaag tggtcctgaa agcatgaata gtagcaatta ctgctgtgaa 180 ttgtgttgta atcctgcttg taccgggtgc tatt 214 89 163 DNA Escherichia coli 89 tcccctcttt tagtcagtca actgaatcac ttgactcttc aaaagagaaa attacattag 60 agactaaaaa gtgtgatgtt gtaaaaaaca acagtgaaaa aaaatcagaa aatatgaaca 120 acacatttta ctgctgtgaa ctttgttgta atcctgcctg tgc 163 90 368 DNA Escherichia coli 90 gcaataaggt tgaggtgatt ttatgaaaaa gaatatcgca tttcttcttg catctatgtt 60 cgttttttct attgctacaa atgcctatgc atctacacaa tcaaataaaa aagatctgtg 120 tgaacattat agacaaatag ccaaggaaag ttgtaaaaaa ggttttttag gggttagaga 180 tggtactgct ggagcatgct ttggcgccca aataatggtt gcagcaaaag gatgctaata 240 tatttatcaa tagcattcag caccatatac acaaaaataa tttttcataa aaagaactct 300 ataaaataaa tattttttgt gacaatgtcc taacgcaaga cggacattgt ccatttctca 360 ctgcaggc 368 91 583 DNA Escherichia coli 91 acactggatg atctcagtgg gcgttcttat gtaatgactg ctgaagatgt tgatcttaca 60 ttgaactggg gaaggttgag tagtgtcctg cctgattatc atggacaaga ctctgttcgt 120 gtaggaagaa tttcttttgg aagcattaat gcaattctgg gaagcgtggc attaatactg 180 
aattgtcatc atcatgcatc gcgagttgcc agaatggcat ctgatgagtt tccttctatg 240 tgtccggcag atggaagagt ccgtgggatt acgcacaata aaatattgtg ggattcatcc 300 actctggggg caattctgat gcgcagaact attagcagtt gagggggtaa aatgaaaaaa 360 acattattaa tagctgcatc gctttcattt ttttcagcaa gtgcgctggc gacgcctgat 420 tgtgtaactg gaaaggtgga gtatacaaaa tataatgatg acgatacctt tacagttaaa 480 gtgggtgata aagaattatt taccaacaga tggaatcttc agtctcttct tctcagtgcg 540 caaattacgg ggatgactgt aaccattaaa actaatgcct gtc 583 92 1612 DNA Escherichia coli 92 agtcctcgat ggcggtccat tatctgcatt atgcgttgtt agctcagccg gacagagcaa 60 ttgccttctg agcaatcggt cactggttcg aatccagtac aacgcgccat atttatttac 120 caggctcgct tttgcgggcc ttttttatat ctgcgccggg tctggtgctg attacttcag 180 ccaaaaggaa cacctgtata tgaagtgtat attatttaaa tgggtactgt gcctgttact 240 gggtttttct tcggtatcct attcccggga gtttacgata gacttttcga cccaacaaag 300 ttatgtctct tcgttaaata gtatacggac agagatatcg acccctcttg aacatatatc 360 tcaggggacc acatcggtgt ctgttattaa ccacacccca ccgggcagtt attttgctgt 420 ggatatacga gggcttgatg tctatcaggc gcgttttgac catcttcgtc tgattattga 480 gcaaaataat ttatatgtgg ccgggttcgt taatacggca acaaatactt tctaccgttt 540 ttcagatttt acacatatat cagtgcccgg tgtgacaacg gtttccatga caacggacag 600 cagttatacc actctgcaac gtgtcgcagc gctggaacgt tccggaatgc aaatcagtcg 660 tcactcactg gtttcatcat atctggcgtt aatggagttc agtggtaata caatgaccag 720 agatgcatcc agagcagttc tgcgttttgt cactgtcaca gcagaagcct tacgcttcag 780 gcagatacag agagaatttc gtcaggcact gtctgaaact gctcctgtgt atacgatgac 840 gccgggagac gtggacctca ctctgaactg ggggcgaatc agcaatgtgc ttccggagta 900 tcggggagag gatggtgtca gagtggggag aatatccttt aataatatat cagcgatact 960 ggggactgtg gccgttatac tgaattgcca tcatcagggg gcgcgttctg ttcgcgccgt 1020 gaatgaagag agtcaaccag aatgtcagat aactggcgac aggcctgtta taaaaataaa 1080 caatacatta tgggaaagta atacagctgc agcgtttctg aacagaaagt cacagttttt 1140 atatacaacg ggtaaataaa ggagttaagc atgaagaaga tgtttatggc ggttttattt 1200 gcattagctt ctgttaatgc aatggcggcg gattgtgcta aaggtaaaat tgagttttcc 1260 aagtataatg aggatgacac 
atttacagtg aaggttgacg ggaaagaata ctggaccagt 1320 cgctggaatc tgcaaccgtt actgcaaagt gctcagttga caggaatgac tgtcacaatc 1380 aaatccagta cctgtgaatc aggctccgga tttgctgaag tgcagtttaa taatgactga 1440 ggcataacct gattcgtggt atgtgggtaa caagtgtaat ctgtgtcaca attcagtcag 1500 ttgacagttg cctgtcagac tgagcatttg ttaaaaaaat ttcgcatggt gaatccccct 1560 gtgtggaggg gcgactggtg aaaaatcctt gcttgtgatt cattatcgac ac 1612 93 502 DNA Escherichia coli 93 gcgaaggaat ttaccttaga cttctcgact gcaaagacgt atgtagattc gctgaatgtc 60 attcgctctg caataggtac tccattacag actatttcat caggaggtac gtctttactg 120 atgattgata gtggctcagg ggataatttg tttgcagttg atgtcagagg gatagatcca 180 gaggaagggc ggtttaataa tctacggctt attgttgaac gaaataattt atatgtgaca 240 ggatttgtta acaggacaaa taatgttttt tatcgctttg ctgatttttc acatgttacc 300 tttccaggta caacagcggt tacattgtct ggtgacagta gctataccac gttacagcgt 360 gttgcaggga tcagtcgtac ggggatgcag ataaatcgcc attcgttgac tacttcttat 420 ctggatttaa tgtcgcatag tggaacctca ctgacgcagt ctgtggcaag agcgatgtta 480 cggtttgtta ctgtgacagc tg 502 94 482 DNA Escherichia coli 94 cttgaacata tatctcaggg gaccacatcg gtgtctgtta ttaaccacac cccaccgggc 60 agttattttg ctgtggatat acgagggctt gatgtctatc aggcgcgttt tgaccatctt 120 cgtctgatta ttgagcaaaa taatttatat gtggccgggt tcgttaatac ggcaacaaat 180 actttctacc gtttttcaga ttttacacat atatcagtgc ccggtgtgac aacggtttcc 240 atgacaacgg acagcagtta taccactctg caacgtgtcg cagcgctgga acgttccgga 300 atgcaaatca gtcgtcactc actggtttca tcatatctgg cgttaatgga gttcagtggt 360 aatacaatga ccagagatgc atccagagca gttctgcgtt ttgtcactgt cacagcagaa 420 gccttacgct tcaggcagat acagagagaa tttcgtcagg cactgtctga aactgctcct 480 gt 482 95 151 DNA Escherichia coli 95 ggtggagtat acaaaatata atgatgacga tacctttaca gttaaagtgg gtgataaaga 60 attatttacc aacagatgga atcttcagtc tcttcttctc agtgcgcaaa ttacggggat 120 gactgtaacc attaaaacta atgcctgtca t 151 96 211 DNA Escherichia coli 96 ttctgttaat gcaatggcgg cggattgtgc taaaggtaaa attgagtttt ccaagtataa 60 tgaggatgac acatttacag tgaaggttga cgggaaagaa tactggacca gtcgctggaa 120 
tctgcaaccg ttactgcaaa gtgctcagtt gacaggaatg actgtcacaa tcaaatccag 180 tacctgtgaa tcaggctccg gatttgctga a 211 97 226 DNA Escherichia coli 97 gaagaagatg tttatagcgg ttttatttgc attggtttct gttaatgcaa tggcggcgga 60 ttgtgctaaa ggtaaaattg agttttccaa gtataatgag gataatacct ttactgtgaa 120 ggtgtcagga agagaatact ggacgaacag atggaatttg cagccattgt tacaaagtgc 180 tcagctgaca gggatgactg taacaatcat atctaatacc tgcagt 226 98 442 DNA Escherichia coli 98 attggtgccg gtgttactgc tgctcttcat cggaaaaacc aaccggcaga acaaacaatc 60 actacacgta cggtagtcga taatcagcct acgaataacg catctgcgca gggcaatact 120 gacacaagtg ggccagaaga gtccccggcg agcagacgta attcgaatgc cagcctcgca 180 tcgaacgggt ctgacacctc cagcacgggc acggtagaga atccgtatgc tgacgttgga 240 atgcccagaa atgattcact ggctcgcatt tcagaggaac ctatttatga tgaggtcgct 300 gcagatccta attatagcgt cattcaacat ttttcaggga acagcccagt taccggaagg 360 ttagtgggaa ccccagggca aggtatccaa agtacttatg cgcttctggc aagcagcggc 420 ggattgcgtt taggtatggg ag 442 99 1521 DNA Escherichia coli 99 atgcctattg gtaatcttgg tcataatccc aatgtgaata attcaattcc tcctgcacct 60 ccattacctt cacaaaccga cggtgcaggg gggcgtggtc agctcattaa ctctacgggg 120 ccgttgggat ctcgtgcgct atttacgcct gtaaggaatt ctatggctga ttctggcgac 180 aatcgtgcca gtgatgttcc tggacttcct gtaaatccga tgcgcctggc ggcgtctgag 240 ataacactga atgatggatt tgaagttctt catgatcatg gtccgctcga tactcttaac 300 aggcagattg gctcttcggt atttcgagtt gaaactcagg aagatggtaa acatattgct 360 gtcggtcaga ggaatggtgt tgagacctct gttgttttaa gtgatcaaga gtacgctcgc 420 ttgcagtcca ttgatcctga aggtaaagac aaatttgtat ttactggagg ccgtggtggt 480 gctgggcatg ctatggtcac cgttgcttca gatatcacgg aagcccgcca aaggatactg 540 gagctgttag agcccaaagg gaccggggag tccaaaggtg ctggggagtc aaaaggcgtt 600 ggggagttga gggagtcaaa tagcggtgcg gaaaacacca cagaaactca gacctcaacc 660 tcaacttcca gccttcgttc agatcctaaa ctttggttgg cgttggggac tgttgctaca 720 ggtctgatag ggttggcggc gacgggtatt gtacaggcgc ttgcattgac gccggagccg 780 gatagcccaa ccacgaccga ccctgatgca gctgcaagtg aaactgaaac tgcgacaaga 840 gatcagttaa cgaaagaagc gttccagaac 
ccagataatc aaaaagttaa tatcgatgag 900 ctcggaaatg cgattccgtc aggggtattg aaagatgatg ttgttgcgaa tatagaagag 960 caggctaaag cagcaggcga agaggccaaa cagcaagcca ttgaaaataa tgctcaggcg 1020 caaaaaaaat atgatgaaca acaagctaaa cgccaggagg agctgaaagt ttcatcgggg 1080 gctggctacg gtcttagtgg cgcattgatt cttggtgggg gaattggtgt tgccgtcacc 1140 gctgcgcttc atcgaaaaaa tcagccggta gaacaaacaa caacaacaac tactacaact 1200 acaactacaa gcgcacgtac ggtagagaat aagcctgcaa ataatacacc tgcacagggc 1260 aatgtagata cccctgggtc agaagatacc atggagagca gacgtagctc gatggctagc 1320 accttgtcga ctttctttga cacttccagc atagggaccg tgcagaatcc gtatgctgat 1380 gttaaaacat cgctgcatga ttcgcaggtg ccgacttcta attctaatac gtctgttcag 1440 aatatgggga atacagattc tgttgtatat agcaccattc aacatcctcc ccgggatact 1500 actgataacg gcgcacggtt a 1521 100 446 DNA Escherichia coli 100 attggtgctg gtgtaacgac tgcgctccat agacgaaatc agccggcaga acagacaact 60 actacaacaa cacatacggt agtgcagcag cagaccggag ggaatacccc agcacaaggt 120 ggcactgatg ccacaagagc agaagatgct tctctgaata gacgtgattc gcaggggagt 180 gttgcatcga cacactggtc agattcctct agcgaagtgg ttaatccata tgctgaagtt 240 ggggagcctc ggaatagtct atcgactcgt cagcaagaag agcatattta cgatgaggtc 300 gctgcagatc ctgtttatag cgtcattcag aatttttcac ggaatgctcc agttaccgga 360 aggttaatgg gaagcccagg gcaaggtatc caaagtactt atgcgcttct ggcaaacagc 420 gctggattgc gtttaggtat gggagg 446 101 288 DNA Escherichia coli 101 ggtgtggtgc gatgagcaca gcaatcaaga agcgtaacct tgaggtgaag actcagatga 60 gtgagaccat ctggcttgaa cccgccagcg aacgcacggt atttctgcag atcaaaaaca 120 cgtctgataa agacatgagt gggctgcagg gcaaaattgc tgatgctgtg aaagcaaaag 180 gatatcaggt ggtgacttct ccggataaag cctactactg gattcaggcg aatgtgctga 240 aggccgataa gatggatctg cgggagtctc agggatggct gaaccgtg 288 102 640 DNA Escherichia coli 102 ggtggtgcac tggagtggag ctttaacagc agtaccggag ctggtgcgct gacacaggga 60 accaccacat atgccatgca cgggcagcag ggaaatgacc tgaatgctgg taagaacctg 120 atatttcagg ggcagaatgg tcagattaac cttaaggatt cggtttctca gggggcgggt 180 tccctgacgt tccgtgataa ttacacagta acaacctcta acggaagtac 
ctggaccggt 240 gccggtattg ttgtggacaa cggggtgtcc gtaaactggc aggttaatgg tgttaagggc 300 gataacctgc ataaaattgg tgaaggtacg ctgacggtac agggtacagg tattaatgaa 360 ggtggcctga aggtcgggga cggaaaggtt gtactgaacc agcaggcgga caataaagga 420 caggtgcagg cgttcagcag tgttaatatt gccagtggcc ggccgaccgt ggtactgact 480 gatgagcggc aggtaaatcc ggataccgtc tcatggggat atcgtggggg cacactggat 540 gttaatggta acagtctgac gtttcatcag ttgaaggcgg cagattatgg tgccgtgctg 600 gcgaataacg ttgataaacg ggccactatc acgctggact 640 103 250 DNA Escherichia coli 103 gcgaaaactg tggaattgat cagcgttggt gggaaagcgc gttacaagaa agccgggcaa 60 ttgctgtgcc aggcagtttt aacgatcagt tcgccgatgc agatattcgt aattatgcgg 120 gcaacgtctg gtatcagcgc gaagtcttta taccgaaagg ttgggcaggc cagcgtatcg 180 tgctgcgttt cgatgcggtc actcattacg gcaaagtgtg ggtcaataat caggaagtga 240 tggagcatca 250 104 501 DNA Escherichia coli 104 ctactgttcc cgagtagtgt gttggcgact caaatatggg gaaaatggtc gctcagtggc 60 gtactcagtg caacccgcgg ctcttacatc ggtgcgttgg catctgcttt gtatattccc 120 tctgcgggcg agggcagtgc tcgcgtgccc ggacgtgatg agttctggta tgaggaagaa 180 ctgcggcaga aagcactagc aggcagtacc gccaccaccc gggtacgttt tttctgggga 240 actgacattc acggcaagcc tcaggtgtat ggtgttcata cgggtgaagg tacgccgtat 300 gaaaacgtcc gcgtggcgaa catgcagtgg aacgagcaga cgcagcgtta tgaatttacc 360 cccgctcacg atgtcgatgg ccccctgatt acctggacgc cggaaaatcc ggaacatggg 420 aatgttccgg gccataccgg taacgacagg ccgccgctgg atcagcccac cattctggtg 480 acgccgattc cggacggcac c 501 105 22 DNA Artificial sequence Artificial Sequence = Primer 105 gcgatcatgg ccgcgaccag ca 22 106 22 DNA Artificial sequence Artificial Sequence = Primer 106 caactcaccc agtagcccca gt 22 107 22 DNA Artificial sequence Artificial Sequence = Primer 107 gaaagtaaat ggaatataaa tg 22 108 23 DNA Artificial sequence Artificial Sequence = Primer 108 tttgtgttgc cgccgctggt gaa 23 109 22 DNA Artificial sequence Artificial Sequence = Primer 109 gaaagtaaat ggaatataaa tg 22 110 23 DNA Artificial sequence Artificial Sequence = Primer 110 tttgtgtcgg tgcagcaggg aaa 23 111 21 DNA 
Artificial sequence Artificial Sequence = Primer 111 ggtgcaatgg ctctgaccac a 21 112 21 DNA Artificial sequence Artificial Sequence = Primer 112 gtcattacaa gagatactac t 21 113 21 DNA Artificial sequence Artificial Sequence = Primer 113 gctcacacca tcaacaccgt t 21 114 21 DNA Artificial sequence Artificial Sequence = Primer 114 cgttgactta gtcaggataa t 21 115 21 DNA Artificial sequence Artificial Sequence = Primer 115 gggcccactc taaccaaaga a 21 116 21 DNA Artificial sequence Artificial Sequence = Primer 116 cggtaattac ctgaaactaa a 21 117 21 DNA Artificial sequence Artificial Sequence = Primer 117 cgtgtgggag ccctgagcct t 21 118 20 DNA Artificial sequence Artificial Sequence = Primer 118 ccggcctggt tgctagtatt 20 119 21 DNA Artificial sequence Artificial Sequence = Primer 119 catcagttgc tagtgcgaat g 21 120 20 DNA Artificial sequence Artificial Sequence = Primer 120 cagcaaatgt caaatacgtt 20 121 21 DNA Artificial sequence Artificial Sequence = Primer 121 cgacatcgac gatctatgac t 21 122 21 DNA Artificial sequence Artificial Sequence = Primer 122 ccaagggata ttgctgaaat a 21 123 21 DNA Artificial sequence Artificial Sequence = Primer 123 catcagttgc tagtgcgaat g 21 124 20 DNA Artificial sequence Artificial Sequence = Primer 124 cagcaaatgt caaatacgtt 20 125 21 DNA Artificial sequence Artificial Sequence = Primer 125 cggagagtac gaccggcgct t 21 126 21 DNA Artificial sequence Artificial Sequence = Primer 126 gcacggctgg ctgctttcgt t 21 127 21 DNA Artificial sequence Artificial Sequence = Primer 127 gctgccatta atagcgcaac t 21 128 21 DNA Artificial sequence Artificial Sequence = Primer 128 tattgttgtt accagccttg c 21 129 21 DNA Artificial sequence Artificial Sequence = Primer 129 gtaatgacgg ttaattctgt t 21 130 21 DNA Artificial sequence Artificial Sequence = Primer 130 gccgcatcaa tagccttaga a 21 131 20 DNA Artificial sequence Artificial Sequence = Primer 131 cccataacgg aacaactcat 20 132 22 DNA Artificial sequence Artificial Sequence = Primer 132 cagaatagac caaacatctg ca 22 133 21 DNA 
Artificial sequence Artificial Sequence = Primer 133 ggccactttc aatgttggtc a 21 134 22 DNA Artificial sequence Artificial Sequence = Primer 134 cgactgcacc tgttcctgat ta 22 135 21 DNA Artificial sequence Artificial Sequence = Primer 135 tctgatatag tttatatggg t 21 136 21 DNA Artificial sequence Artificial Sequence = Primer 136 tcaaacccca ctcttaatta a 21 137 21 DNA Artificial sequence Artificial Sequence = Primer 137 ttgcaaaagc aattttgcaa c 21 138 21 DNA Artificial sequence Artificial Sequence = Primer 138 tgccgaacaa tgttctctgc a 21 139 21 DNA Artificial sequence Artificial Sequence = Primer 139 aattgtttta aaatctgttc t 21 140 21 DNA Artificial sequence Artificial Sequence = Primer 140 tgagactgaa attacattta a 21 141 21 DNA Artificial sequence Artificial Sequence = Primer 141 gaacaattca aacagttcag t 21 142 21 DNA Artificial sequence Artificial Sequence = Primer 142 ttattcaaat cgcgcaatac c 21 143 20 DNA Artificial sequence Artificial Sequence = Primer 143 caaatacagt ccgcgtacga 20 144 21 DNA Artificial sequence Artificial Sequence = Primer 144 ccattgttac ctaaagagcg t 21 145 21 DNA Artificial sequence Artificial Sequence = Primer 145 ttggcagtta caggaatgca t 21 146 21 DNA Artificial sequence Artificial Sequence = Primer 146 aacagtgaac catattttag t 21 147 20 DNA Artificial sequence Artificial Sequence = Primer 147 atgaggaaca taatggcagg 20 148 20 DNA Artificial sequence Artificial Sequence = Primer 148 tctggtcagg tcgtcaatac 20 149 21 DNA Artificial sequence Artificial Sequence = Primer 149 ggtgatcgat tattccgctg a 21 150 21 DNA Artificial sequence Artificial Sequence = Primer 150 acgcctcatc ggtcagcgcc t 21 151 21 DNA Artificial sequence Artificial Sequence = Primer 151 tctgcacgtt taaaattatt g 21 152 21 DNA Artificial sequence Artificial Sequence = Primer 152 gtttctccgt cagaatcaag c 21 153 21 DNA Artificial sequence Artificial Sequence = Primer 153 ctaccgtagc gggcgatggt a 21 154 21 DNA Artificial sequence Artificial Sequence = Primer 154 cagcgcctgt ccgtgttcgg c 21 155 21 DNA 
Artificial sequence Artificial Sequence = Primer 155 ccctgacctt gggtgttgcg a 21 156 20 DNA Artificial sequence Artificial Sequence = Primer 156 gtactgaact ttaaaggtgg 20 157 20 DNA Artificial sequence Artificial Sequence = Primer 157 aagaaatcaa tattatttat 20 158 18 DNA Artificial sequence Artificial Sequence = Primer 158 aatagcaccc ggtacaag 18 159 20 DNA Artificial sequence Artificial Sequence = Primer 159 gcgaaggaat ttaccttaga 20 160 20 DNA Artificial sequence Artificial Sequence = Primer 160 cagctgtcac agtaacaaac 20 161 20 DNA Artificial sequence Artificial Sequence = Primer 161 cttgaacata tatctcaggg 20 162 21 DNA Artificial sequence Artificial Sequence = Primer 162 acaggagcag tttcagacag t 21 163 21 DNA Artificial sequence Artificial Sequence = Primer 163 ggtggagtat acaaaatata a 21 164 21 DNA Artificial sequence Artificial Sequence = Primer 164 atgacaggca ttagttttaa t 21 165 20 DNA Artificial sequence Artificial Sequence = Primer 165 ttctgttaat gcaatggcgg 20 166 21 DNA Artificial sequence Artificial Sequence = Primer 166 ttcagcaaat ccggagcctg a 21 167 20 DNA Artificial sequence Artificial Sequence = Primer 167 gaagaagatg tttatagcgg 20 168 21 DNA Artificial sequence Artificial Sequence = Primer 168 actgcaggta ttagatatga t 21 169 22 DNA Artificial sequence Artificial Sequence = Primer 169 attggtgccg gtgttactgc tg 22 170 20 DNA Artificial sequence Artificial Sequence = Primer 170 ctcccatacc taaacgcaat 20 171 21 DNA Artificial sequence Artificial Sequence = Primer 171 attggtgttg ccgtcaccgc t 21 172 18 DNA Artificial sequence Artificial Sequence = Primer 172 acgccatgac atgggagg 18 173 21 DNA Artificial sequence Artificial Sequence = Primer 173 attggtgctg gtgtaacgac t 21 174 18 DNA Artificial sequence Artificial Sequence = Primer 174 attgcgttta ggtatggg 18 175 21 DNA Artificial sequence Artificial Sequence = Primer 175 ctactgttcc cgagtagtgt g 21 176 21 DNA Artificial sequence Artificial Sequence = Primer 176 ggtgccgtcc ggaatcggcg t 21 
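By way of illustration only, the entries of the sequence listing above can be checked mechanically. The following minimal sketch assumes each entry follows the flattened pattern seen above (identifier, declared length, "DNA", organism and feature text, the sequence in space-separated groups, then the length repeated). The names `ENTRY`, `parse_listing` and `SAMPLE` are hypothetical and form no part of the listing itself.

```python
import re

# Assumed entry layout, as it appears in the flattened listing:
# <id> <length> DNA Artificial sequence Artificial Sequence = Primer <id> <groups...> <length>
ENTRY = re.compile(
    r"\b(\d+) (\d+) DNA Artificial sequence Artificial Sequence = Primer (\d+) "
    r"((?:[acgt]+ )+)(\d+)"
)

def parse_listing(text):
    """Return one dict per entry, with the sequence groups joined together."""
    entries = []
    for match in ENTRY.finditer(text):
        seq_id, declared, primer_no, groups, trailing = match.groups()
        sequence = groups.replace(" ", "")
        entries.append({
            "seq_id": int(seq_id),
            "primer": int(primer_no),
            "declared_length": int(declared),
            "trailing_length": int(trailing),
            "sequence": sequence,
            # True when the sequence length agrees with both stated lengths
            "length_ok": len(sequence) == int(declared) == int(trailing),
        })
    return entries

# Three entries copied from the listing above (SEQ ID NOs: 117, 118 and 158).
SAMPLE = (
    "117 21 DNA Artificial sequence Artificial Sequence = Primer 117 "
    "cgtgtgggag ccctgagcct t 21 "
    "118 20 DNA Artificial sequence Artificial Sequence = Primer 118 "
    "ccggcctggt tgctagtatt 20 "
    "158 18 DNA Artificial sequence Artificial Sequence = Primer 158 "
    "aatagcaccc ggtacaag 18"
)

if __name__ == "__main__":
    for entry in parse_listing(SAMPLE):
        print(entry["seq_id"], entry["sequence"], entry["length_ok"])
```

An entry whose sequence length disagrees with its declared length would be reported with `length_ok` set to `False`, which may help locate transcription errors in the listing.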

What is claimed is:
 1. An array comprising: (a) a substrate; and (b) a plurality of nucleic acid probes, each of said probes being bound to said substrate at a discrete location; said plurality of probes comprising a first probe for a first pathotype of a species of a microorganism and a second probe for a second pathotype of said species, wherein said first and second pathotypes are not identical.
 2. The array of claim 1, comprising at least two probes for a single pathotype, wherein said two probes are not identical.
 3. The array of claim 2, wherein said array comprises a subarray, wherein said subarray comprises said at least two probes at adjacent discrete locations on said substrate.
 4. The array of claim 1, wherein said probe is for a virulence gene or fragment thereof or a sequence substantially identical thereto, wherein said virulence gene is associated with pathogenicity of said microorganism.
 5. The array of claim 1, wherein said microorganism is a bacterium.
 6. The array of claim 5, wherein said bacterium is of the family Enterobacteriaceae.
 7. The array of claim 6, wherein said bacterium is E. coli.
 8. The array of claim 7, wherein said first and second pathotypes each independently comprise a pathotype selected from the group consisting of: (a) enterotoxigenic E. coli (ETEC); (b) enteropathogenic E. coli (EPEC); (c) enterohemorrhagic E. coli (EHEC); (d) enteroaggregative E. coli (EAEC); (e) enteroinvasive E. coli (EIEC); (f) uropathogenic strains (UPEC); (g) E. coli strains involved in neonatal meningitis (MENEC); (h) E. coli strains involved in septicemia (SEPEC); (i) cell-detaching E. coli (CDEC); and (j) diffusely adherent E. coli (DAEC).
 9. The array of claim 7, wherein said first pathotype is selected from the group consisting of: (a) enteroaggregative E. coli (EAEC); (b) enteroinvasive E. coli (EIEC); (c) E. coli strains involved in neonatal meningitis (MENEC); (d) E. coli strains involved in septicemia (SEPEC); (e) cell-detaching E. coli (CDEC); and (f) diffusely adherent E. coli (DAEC).
 10. The array of claim 4, wherein said virulence gene encodes a polypeptide of a class of proteins selected from the group consisting of toxins, adhesion factors, secretory system proteins, capsule antigens, somatic antigens, flagellar antigens, invasins, autotransporter proteins, and aerobactin system proteins.
 11. The array of claim 4, wherein said virulence gene is selected from the group consisting of afaBC3, afaE5, afaE7, afaD8, aggA, aggC, aida, bfpA, bmaE, cdt1, cdt2, cdt3, cfaI, clpG, cnf1, cnf2, cs1, cs3, cs31a, cvaC, derb122, eae, eaf, east1, ehxA, espA group I, espA group II, espA group III, espB group I, espB group II, espB group III, espC, espP, etpD, F17A, F17G, F18, F4, F41, F5, F6, fimA group I, fimA group II, fimH, fliC, focG, fyuA, hlyA, hlyC, ibe10, iha, invX, ipaC, iroN, irp1, irp2, iss, iucD, iutA, katP, kfiB, kpsMTII, kpsMTIII, 17095, leoA, lngA, lt, neuC, nfaE, ompA, ompT, paa, papAH, papC, papEF, papG group I, papG group II, papG group III, pai, rfbO9, rfbO101, rfbO111, rfbE O157, rfbE O157 H7, rfc O4, rtx, sfaDE, sfaA, stah, stap, stb, stx1, stx2, stxA I, stxA II, stxB I, stxB II, stxB III, tir group I, tir group II, tir group III, traT, and tsh.
 12. The array of claim 1, wherein said probe comprises a nucleic acid sequence selected from the group consisting of SEQ ID NO:1 to SEQ ID NO:102, or a fragment thereof, or a sequence substantially identical thereto.
 13. A method of detecting the presence of a microorganism in a sample, said method comprising: (a) contacting the array of claim 1 with a sample nucleic acid of said sample; and (b) detecting association of said sample nucleic acid with a probe on said array; wherein association of said sample nucleic acid with said probe is indicative that said sample comprises a microorganism from which the nucleic acid sequence of said probe is derived.
 14. The method of claim 13, wherein said method further comprises extracting said sample nucleic acid from said sample prior to contacting it with said array.
 15. The method of claim 13, wherein said sample nucleic acid is not amplified by PCR prior to contacting it with said array.
 16. The method of claim 13, wherein said method further comprises digesting said sample nucleic acid with a restriction endonuclease to produce fragments of said sample nucleic acid.
 17. The method of claim 16, wherein said fragments are of an average size of about 0.2 Kb to about 12 Kb.
 18. The method of claim 13, wherein said sample is selected from the group consisting of environmental samples, biological samples and food.
 19. The method of claim 18, wherein said environmental samples are selected from the group consisting of water, air and soil.
 20. The method of claim 18, wherein said biological samples are selected from the group consisting of blood, urine, amniotic fluid, feces, tissues, cells, cell cultures and biological secretions, excretions and discharge.
 21. The method of claim 13, wherein said method is further for determining a pathotype of a species of said microorganism, wherein said probe is for a pathotype of said species and wherein association of said sample nucleic acid with said probe is indicative that said microorganism is of said pathotype.
 22. The method of claim 13, wherein said sample is a tissue, body fluid, secretion or excretion from a subject and said method is further for diagnosing an infection by said microorganism in said subject, wherein association of said nucleic acid with said probe is indicative that said subject is infected by said microorganism.
 23. The method of claim 22, wherein said method is for diagnosing a condition related to infection by said microorganism in said subject, wherein said probe is for a pathotype of said species and wherein association of said sample nucleic acid with said probe is indicative that said microorganism is of said pathotype and that said subject suffers from a condition associated with said pathotype.
 24. The method of claim 23, wherein said condition is selected from the group consisting of: diarrhea, hemorrhagic colitis, hemolytic uremic syndrome, invasive intestinal infections, dysentery, urinary tract infections, neonatal meningitis and septicemia.
 25. The method of claim 22, wherein said subject is a mammal.
 26. The method of claim 22, wherein said subject is a human.
 27. A commercial package comprising the array of claim 1 together with instructions for: (a) detecting the presence of a microorganism in a sample; (b) determining the pathotype of a microorganism in a sample; (c) diagnosing an infection by a microorganism in a subject; (d) diagnosing a condition related to infection by a microorganism in a subject; or (e) any combination of (a) to (d).
 28. A method of producing an array for pathotyping a microorganism in a sample, said method comprising: (a) providing a plurality of nucleic acid probes, said plurality of probes comprising a first probe for a first pathotype of a species of said microorganism and a second probe for a second pathotype of said species, wherein said first and second probes are different; and (b) applying each of said plurality of probes to a different discrete location of a substrate.
 29. A method of producing an array for pathotyping a microorganism in a sample, said method comprising: (a) selecting a plurality of nucleic acid probes, said plurality of probes comprising a first probe for a first pathotype of a species of said microorganism and a second probe for a second pathotype of said species, wherein said first and second probes are different; and (b) synthesizing each of said plurality of probes at a different discrete location of a substrate. 
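
By way of non-limiting illustration, the determination recited in claims 13 and 21 (association of a sample nucleic acid with a probe for a pathotype indicating that pathotype) may be sketched as follows. The probe-to-pathotype table, the signal-to-background ratios and the threshold of 2.0 are assumptions chosen for the example and are not taken from the description; `PROBE_PATHOTYPE` and `call_pathotypes` are hypothetical names.

```python
# Hypothetical, deliberately small probe-to-pathotype table (illustrative assumption).
# The gene names appear in claim 11; the pathotype assignments reflect
# commonly reported associations and are not an exhaustive mapping.
PROBE_PATHOTYPE = {
    "stx1": "EHEC",
    "stx2": "EHEC",
    "bfpA": "EPEC",
    "lt": "ETEC",
    "ipaC": "EIEC",
    "aggA": "EAEC",
}

def call_pathotypes(signals, threshold=2.0):
    """Map hybridization signals to candidate pathotypes.

    signals: dict of probe name -> signal-to-background ratio (assumed metric).
    threshold: assumed cut-off above which a probe counts as positive.
    Returns a dict of pathotype -> sorted list of positive probes.
    """
    hits = {}
    for gene, ratio in signals.items():
        pathotype = PROBE_PATHOTYPE.get(gene)
        # Ignore probes absent from the table and signals below the cut-off.
        if pathotype is not None and ratio >= threshold:
            hits.setdefault(pathotype, []).append(gene)
    return {pathotype: sorted(genes) for pathotype, genes in hits.items()}

if __name__ == "__main__":
    example = {"stx1": 5.1, "stx2": 3.2, "bfpA": 0.9, "lt": 1.1}
    print(call_pathotypes(example))
```

In practice the cut-off and the probe table would be set from the array actually used and its controls; the sketch only shows the shape of the decision.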